Last active
August 29, 2015 14:06
example sparkgrams package usage
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:af6593799dadc0dd1fe438bdf0178cce6a1a929730acd012ecf3deeca5c0310b" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%matplotlib inline\n", | |
"import matplotlib.pylab as plt\n", | |
"plt.rcParams['figure.figsize'] = (10,6)\n", | |
"plt.rcParams['font.size'] = 18" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 1 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Before doing anything else, make sure you can ssh into your own machine without a password. Executing this cell should print 'woohoo!!':" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%bash\n", | |
"ssh localhost echo \"woohoo!\"" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"woohoo!\n" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stderr", | |
"text": [ | |
"X11 forwarding request failed on channel 0\r\n" | |
] | |
} | |
], | |
"prompt_number": 2 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If it doesn't, [set up ssh keys](https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#safe=off&q=set%20up%20ssh%20keys) first and then make sure the contents of your own public key (e.g. ``~/.ssh/id_rsa.pub``) are in ``~/.ssh/authorized_keys``. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#Example using Apache Spark to process Bloomberg case data" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"[Apache Spark](http://spark.apache.org) is a distributed computing runtime system that facilitates the processing of large amounts of data across many machines. See the [documentation](http://spark.apache.org/docs/latest/) for some examples and info. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This notebook will show some example usage of the ``bloomberg_ngrams`` code that can run locally for testing and development before migrating to a larger cluster environment. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"First, we initialize some environment variables -- modify these to fit your needs:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import os\n", | |
"homedir = os.environ['HOME']\n", | |
"os.environ['SPARK_HOME'] = '%s/spark'%homedir\n", | |
"spark_home = os.environ['SPARK_HOME']" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 15 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"First make sure that if there are processes running from before, we can shut them down:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%bash\n", | |
"~/spark/sbin/stop-all.sh" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"localhost: X11 forwarding request failed on channel 0\r\n", | |
"localhost: stopping org.apache.spark.deploy.worker.Worker\n", | |
"stopping org.apache.spark.deploy.master.Master\n" | |
] | |
} | |
], | |
"prompt_number": 4 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%bash\n", | |
"~/spark/sbin/start-all.sh " | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"starting org.apache.spark.deploy.master.Master, logging to /Users/rokstar/src/spark-1.0.1-bin-hadoop1/sbin/../logs/spark-rokstar-org.apache.spark.deploy.master.Master-1-public-docking-als-0453.ethz.ch.out\n", | |
"localhost: X11 forwarding request failed on channel 0\r\n", | |
"localhost: starting org.apache.spark.deploy.worker.Worker, logging to /Users/rokstar/src/spark-1.0.1-bin-hadoop1/sbin/../logs/spark-rokstar-org.apache.spark.deploy.worker.Worker-1-public-docking-als-0453.ethz.ch.out\n" | |
] | |
} | |
], | |
"prompt_number": 5 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"This will start the spark runtime on the local machine using all available cores. \n", | |
"\n", | |
"Once the spark runtime is going, we can start the analysis" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"##Initialize the Spark Context" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import os\n", | |
"master_url = 'spark://%s:7077'%os.environ['HOST']\n", | |
"print master_url" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"spark://public-docking-als-0453.ethz.ch:7077\n" | |
] | |
} | |
], | |
"prompt_number": 6 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"add the spark python library to python path" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import sys\n", | |
"sys.path.insert(0,os.environ['SPARK_HOME']+'/python')\n", | |
"sys.path.insert(0,os.environ['SPARK_HOME']+'/python/lib/py4j-0.8.1-src.zip')" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 7 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import pyspark\n", | |
"from pyspark import SparkContext" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 8 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We need to set the executor memory, i.e. how much memory each worker is allowed to use. By default this is something small like 0.5 Gb. Here it is adjusted to 4Gb, but set this to whatever is appropriate for your machine." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"os.environ['SPARK_EXECUTOR_MEMORY'] = '4g'\n", | |
"try : \n", | |
" sc.stop()\n", | |
"except : \n", | |
" pass\n", | |
"\n", | |
"sc = SparkContext(master = master_url, appName = 'Case Notebook', batchSize=10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 9 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now we've got a ``SparkContext`` initialized, allowing us to distribute the computation across the set of compute nodes. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"##Start analyzing case data" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Set up some constants: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import bloomberg_ngrams, sparkgram\n", | |
"data_path = '%s/project_data/decade_data'%homedir\n", | |
"JUDGE_BIO_PATH = data_path+'/inputs/bio_database_app_and_dist.csv'\n", | |
"CASE_DB_PATH = data_path+'/inputs/BloombergCASELEVEL_Touse.csv'\n", | |
"from sklearn import feature_extraction\n", | |
"sw = feature_extraction.text.ENGLISH_STOP_WORDS" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 16 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We create a ``SparkBloombergCaseVectorizer`` object that will process the cases from one year of cases. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"year_range = ['1900']" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 17 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now we can initialize the ``SparkBloombergCaseVectorizer``.\n", | |
"\n", | |
"It is highly recommended to specify the ``numPartitions`` to be something like 4-5 times the number of cores. This splits up the shuffling tasks that can otherwise lead to memory problems. If the spark jobs start dying off with obscure memory or IO errors, try increasing the number of partitions. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"reload(sparkgram)\n", | |
"reload(sparkgram.document_vectorizer)\n", | |
"reload(bloomberg_ngrams)\n", | |
"\n", | |
"cv = bloomberg_ngrams.SparkBloombergCaseVectorizer(sc, year_range, JUDGE_BIO_PATH, CASE_DB_PATH, datadir = data_path,\n", | |
" ngram_range=[1,3], stop_words = sw, num_partitions=sc.defaultParallelism*4)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 151 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now lets do the processing step by step to see how long each stage takes. First, construct the ``doc_rdd`` which reads all the case files and extracts the text: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%time cv.load_text() " | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"CPU times: user 17.3 ms, sys: 5.24 ms, total: 22.6 ms\n", | |
"Wall time: 29.5 s\n" | |
] | |
} | |
], | |
"prompt_number": 152 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You can check on the [job stage timing](http://localhost:4040/stages/) and [cached storage](http://localhost:4040/storage/) at any time to see the jobs' progress." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The ``doc_rdd`` is now a collection of ``(context,text)`` pairs:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.doc_rdd.first()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 154, | |
"text": [ | |
"('XFKBB2||05feb1900||Insurance Law||1900||9||GILBERT, WILLIAM B.||MORROW, WILLIAM W.||ROSS, ERSKINE M.||MORROW, WILLIAM W.||MORROW||0||0||1',\n", | |
" \". this is an action upon a fire insurance policy, brought by the plaintiff, james f. mcelroy, in the superior court of the state of washington, to recover the sum of $2,530.85, with interest, alleged to be due upon a policy of fire insurance issued to mrs. j. c. powers, mcelroy's assignor, by the palatine insurance company, insuring the steamer cricket, her hull, cabins, tackle, furniture, etc., against loss or damage by fire to the extent of $3,500. upon the petition of the defendant insurance company the case was removed to the circuit court of the united states for the district of washington, where it was tried before a jury. a verdict was returned in favor of mcelroy for the full amount of his demand. motion for a new trial was denied, and judgment entered in favor of the plaintiff, to reverse which the defendant sued out this writ of error. the case of mcelroy v. british america assur. co. , reported in 36 c. c. a. 615 , 94 fed. 990 , was a companion case to this, the plaintiff being the same, and the action brought to recover on a policy issued by the defendant company upon the same risk, at the same time, and under substantially the same circumstances, except that the insurance against loss or damage by fire in that case was to the extent of $3,000. the negotiations leading to the issuance of the policies were had between the same parties in each case, and a detailed statement of the material facts will be found in the report of the british america assurance company case , supra . the two cases come before this court upon practically the same record, the only distinction to be noticed in the legal aspect of the two cases arising upon the assignments of error. 
in the case against the british america assurance company it was contended on the part of the defendant as a matter of law that the policy was void for the reason that it in express terms provided that, if the property be or become incumbered by a chattel mortgage, or if insurance should be obtained in excess of $6,500 in all concurrent with the amount covered by the policy, it should be void; and both of such forbidden acts on the part of the plaintiff had been established by the evidence. there was, however, on the other hand, evidence tending to show that calhoun & co., a firm of insurance agents and brokers in seattle, who negotiated the insurance in the sum of $6,500, had notice and knowledge of the existence of a chattel mortgage, and of the intention of the insured to secure further insurance in the amount of $3,500 to cover the interest of the holder of that mortgage; that calhoun & co. endeavored to secure this insurance also, but it was placed elsewhere. there was also evidence that calhoun & co., having secured the application for the contract of insurance in the sum of $6,500, placed $3,000 with an agent of the british america assurance company, the defendant in that case, and $3,500 with the agents of the palatine insurance company, the defendant in this case; that calhoun & co. received the written policies in the amounts stated from the agents of the companies and delivered these policies to the agent of the insured, and at the same time collected a portion of the premium. upon the evidence of these facts it was contended on the part of the plaintiff that calhoun & co. were the agents of the defendant in the transaction, and had so dealt with the agent of the insured, in securing the contract of insurance and in the delivery of the policy in that case to the agent of the insured, as to bind themselves and the insurance company by way of estoppel not to dispute the validity of the policy on account of the conditions. 
to this claim the defendant replied that calhoun & co. were not its agents, but the agents of the insured, and therefore any notice or knowledge which they may have had concerning the mortgage and the excess of concurrent insurance was not a notice to or the knowledge of the defendant. the trial court, at the close of the testimony, instructed the jury to return a verdict for the insurance company, which was accordingly done, and judgment entered for the defendant. the case was brought before this court upon the single assignment of error that the court erred in giving such peremptory instruction, and the only question there presented was whether, upon the testimony introduced, plaintiff had the right to have the case submitted to the jury. it was held by this court that the evidence was sufficient to go to the jury, and the case was remanded, with instructions to grant a new trial. the case at bar was the first of the two cases to be heard in the trial court. the issues were the same, except that in the present case the fact that the property was incumbered with a chattel mortgage, without being provided for by an agreement indorsed on the policy, was not made a ground of defense. the evidence upon the remaining issues was submitted to the jury, and a verdict rendered for the plaintiff (defendant in error).\\n the first error assigned relates to the refusal of the court to instruct the jury to find a verdict for the defendant (plaintiff in error). as the evidence in this case is substantially the same as the evidence in the british america assurance case , that question has been considered and determined by this court, and requires no further discussion.\\n the remaining assignments of error relate to instructions given and instructions refused by the court concerning the question of agency. they involve the question whether, upon the evidence in the case, the knowledge of calhoun & co. 
as to the excessive insurance could be imputed to the company, and notice to them be considered notice to the company; and whether the acts of calhoun & co. in dealing with the agent of the insured in securing the contract of insurance and in delivering the policy to the agent of the insured bound themselves and the insurance company by way of estoppel not to deny the validity of the policy by reason of the alleged breach of condition. these questions were fully discussed and passed upon in the other case, and in accordance with the views there expressed the instructions complained of cannot be held to have been error. the judgment of the circuit court is affirmed. \\n\")" | |
] | |
} | |
], | |
"prompt_number": 154 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Extract the ngrams from each case text: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.ngram_rdd.cache()\n", | |
"%time cv.ngram_rdd.count()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"CPU times: user 7.12 ms, sys: 1.75 ms, total: 8.87 ms\n", | |
"Wall time: 4.42 s\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 155, | |
"text": [ | |
"695" | |
] | |
} | |
], | |
"prompt_number": 155 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Each document is now represented as a tuple of ``(context, ngram)`` where ``ngram`` = ``(ngram_string, count)``. Therefore, if we want to construct a vocabulary or extract the feature vectors we just need to slice this data in whatever way is needed. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.ngram_rdd.first()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 156, | |
"text": [ | |
"('XFKBB2||05feb1900||Insurance Law||1900||9||GILBERT, WILLIAM B.||MORROW, WILLIAM W.||ROSS, ERSKINE M.||MORROW, WILLIAM W.||MORROW||0||0||1',\n", | |
" [('accordance', 1),\n", | |
" ('accordance views', 1),\n", | |
" ('accordance views expressed', 1),\n", | |
" ('accordingly', 1),\n", | |
" ('accordingly judgment', 1),\n", | |
" ('accordingly judgment entered', 1),\n", | |
" ('account', 1),\n", | |
" ('account conditions', 1),\n", | |
" ('account conditions claim', 1),\n", | |
" ('action', 2),\n", | |
" ('action brought', 1),\n", | |
" ('action brought recover', 1),\n", | |
" ('action insurance', 1),\n", | |
" ('action insurance policy', 1),\n", | |
" ('acts', 2),\n", | |
" ('acts calhoun', 1),\n", | |
" ('acts calhoun dealing', 1),\n", | |
" ('acts plaintiff', 1),\n", | |
" ('acts plaintiff established', 1),\n", | |
" ('affirmed', 1),\n", | |
" ('agency', 1),\n", | |
" ('agency involve', 1),\n", | |
" ('agency involve question', 1),\n", | |
" ('agent', 6),\n", | |
" ('agent british', 1),\n", | |
" ('agent british america', 1),\n", | |
" ('agent insured', 5),\n", | |
" ('agent insured bind', 1),\n", | |
" ('agent insured bound', 1),\n", | |
" ('agent insured securing', 2),\n", | |
" ('agent insured time', 1),\n", | |
" ('agents', 6),\n", | |
" ('agents agents', 1),\n", | |
" ('agents agents insured', 1),\n", | |
" ('agents brokers', 1),\n", | |
" ('agents brokers seattle', 1),\n", | |
" ('agents companies', 1),\n", | |
" ('agents companies delivered', 1),\n", | |
" ('agents defendant', 1),\n", | |
" ('agents defendant transaction', 1),\n", | |
" ('agents insured', 1),\n", | |
" ('agents insured notice', 1),\n", | |
" ('agents palatine', 1),\n", | |
" ('agents palatine insurance', 1),\n", | |
" ('agreement', 1),\n", | |
" ('agreement indorsed', 1),\n", | |
" ('agreement indorsed policy', 1),\n", | |
" ('alleged', 2),\n", | |
" ('alleged breach', 1),\n", | |
" ('alleged breach condition', 1),\n", | |
" ('alleged policy', 1),\n", | |
" ('alleged policy insurance', 1),\n", | |
" ('america', 5),\n", | |
" ('america assur', 1),\n", | |
" ('america assur reported', 1),\n", | |
" ('america assurance', 4),\n", | |
" ('america assurance case', 1),\n", | |
" ('america assurance company', 3),\n", | |
" ('amounts', 1),\n", | |
" ('amounts stated', 1),\n", | |
" ('amounts stated agents', 1),\n", | |
" ('application', 1),\n", | |
" ('application contract', 1),\n", | |
" ('application contract insurance', 1),\n", | |
" ('arising', 1),\n", | |
" ('arising assignments', 1),\n", | |
" ('arising assignments error', 1),\n", | |
" ('aspect', 1),\n", | |
" ('aspect cases', 1),\n", | |
" ('aspect cases arising', 1),\n", | |
" ('assigned', 1),\n", | |
" ('assigned relates', 1),\n", | |
" ('assigned relates refusal', 1),\n", | |
" ('assignment', 1),\n", | |
" ('assignment error', 1),\n", | |
" ('assignment error court', 1),\n", | |
" ('assignments', 2),\n", | |
" ('assignments error', 2),\n", | |
" ('assignments error case', 1),\n", | |
" ('assignments error relate', 1),\n", | |
" ('assignor', 1),\n", | |
" ('assignor palatine', 1),\n", | |
" ('assignor palatine insurance', 1),\n", | |
" ('assur', 1),\n", | |
" ('assur reported', 1),\n", | |
" ('assur reported c', 1),\n", | |
" ('assurance', 4),\n", | |
" ('assurance case', 1),\n", | |
" ('assurance case question', 1),\n", | |
" ('assurance company', 3),\n", | |
" ('assurance company case', 1),\n", | |
" ('assurance company contended', 1),\n", | |
" ('assurance company defendant', 1),\n", | |
" ('bar', 1),\n", | |
" ('bar cases', 1),\n", | |
" ('bar cases heard', 1),\n", | |
" ('bind', 1),\n", | |
" ('bind insurance', 1),\n", | |
" ('bind insurance company', 1),\n", | |
" ('bound', 1),\n", | |
" ('bound insurance', 1),\n", | |
" ('bound insurance company', 1),\n", | |
" ('breach', 1),\n", | |
" ('breach condition', 1),\n", | |
" ('breach condition questions', 1),\n", | |
" ('british', 5),\n", | |
" ('british america', 5),\n", | |
" ('british america assur', 1),\n", | |
" ('british america assurance', 4),\n", | |
" ('brokers', 1),\n", | |
" ('brokers seattle', 1),\n", | |
" ('brokers seattle negotiated', 1),\n", | |
" ('brought', 3),\n", | |
" ('brought court', 1),\n", | |
" ('brought court single', 1),\n", | |
" ('brought plaintiff', 1),\n", | |
" ('brought plaintiff james', 1),\n", | |
" ('brought recover', 1),\n", | |
" ('brought recover policy', 1),\n", | |
" ('c', 3),\n", | |
" ('c c', 1),\n", | |
" ('c c fed', 1),\n", | |
" ('c fed', 1),\n", | |
" ('c fed companion', 1),\n", | |
" ('c powers', 1),\n", | |
" (\"c powers mcelroy's\", 1),\n", | |
" ('cabins', 1),\n", | |
" ('cabins tackle', 1),\n", | |
" ('cabins tackle furniture', 1),\n", | |
" ('calhoun', 8),\n", | |
" ('calhoun agents', 2),\n", | |
" ('calhoun agents agents', 1),\n", | |
" ('calhoun agents defendant', 1),\n", | |
" ('calhoun dealing', 1),\n", | |
" ('calhoun dealing agent', 1),\n", | |
" ('calhoun endeavored', 1),\n", | |
" ('calhoun endeavored secure', 1),\n", | |
" ('calhoun excessive', 1),\n", | |
" ('calhoun excessive insurance', 1),\n", | |
" ('calhoun firm', 1),\n", | |
" ('calhoun firm insurance', 1),\n", | |
" ('calhoun having', 1),\n", | |
" ('calhoun having secured', 1),\n", | |
" ('calhoun received', 1),\n", | |
" ('calhoun received written', 1),\n", | |
" ('case', 19),\n", | |
" ('case accordance', 1),\n", | |
" ('case accordance views', 1),\n", | |
" ('case agent', 1),\n", | |
" ('case agent insured', 1),\n", | |
" ('case agents', 1),\n", | |
" ('case agents palatine', 1),\n", | |
" ('case bar', 1),\n", | |
" ('case bar cases', 1),\n", | |
" ('case british', 1),\n", | |
" ('case british america', 1),\n", | |
" ('case brought', 1),\n", | |
" ('case brought court', 1),\n", | |
" ('case calhoun', 1),\n", | |
" ('case calhoun received', 1),\n", | |
" ('case detailed', 1),\n", | |
" ('case detailed statement', 1),\n", | |
" ('case extent', 1),\n", | |
" ('case extent negotiations', 1),\n", | |
" ('case fact', 1),\n", | |
" ('case fact property', 1),\n", | |
" ('case knowledge', 1),\n", | |
" ('case knowledge calhoun', 1),\n", | |
" ('case mcelroy', 1),\n", | |
" ('case mcelroy v', 1),\n", | |
" ('case plaintiff', 1),\n", | |
" ('case plaintiff action', 1),\n", | |
" ('case question', 1),\n", | |
" ('case question considered', 1),\n", | |
" ('case remanded', 1),\n", | |
" ('case remanded instructions', 1),\n", | |
" ('case removed', 1),\n", | |
" ('case removed circuit', 1),\n", | |
" ('case submitted', 1),\n", | |
" ('case submitted jury', 1),\n", | |
" ('case substantially', 1),\n", | |
" ('case substantially evidence', 1),\n", | |
" ('case supra', 1),\n", | |
" ('case supra cases', 1),\n", | |
" ('cases', 3),\n", | |
" ('cases arising', 1),\n", | |
" ('cases arising assignments', 1),\n", | |
" ('cases come', 1),\n", | |
" ('cases come court', 1),\n", | |
" ('cases heard', 1),\n", | |
" ('cases heard trial', 1),\n", | |
" ('chattel', 3),\n", | |
" ('chattel mortgage', 3),\n", | |
" ('chattel mortgage insurance', 1),\n", | |
" ('chattel mortgage intention', 1),\n", | |
" ('chattel mortgage provided', 1),\n", | |
" ('circuit', 2),\n", | |
" ('circuit court', 2),\n", | |
" ('circuit court affirmed', 1),\n", | |
" ('circuit court united', 1),\n", | |
" ('circumstances', 1),\n", | |
" ('circumstances insurance', 1),\n", | |
" ('circumstances insurance loss', 1),\n", | |
" ('claim', 1),\n", | |
" ('claim defendant', 1),\n", | |
" ('claim defendant replied', 1),\n", | |
" ('close', 1),\n", | |
" ('close testimony', 1),\n", | |
" ('close testimony instructed', 1),\n", | |
" ('collected', 1),\n", | |
" ('collected portion', 1),\n", | |
" ('collected portion premium', 1),\n", | |
" ('come', 1),\n", | |
" ('come court', 1),\n", | |
" ('come court practically', 1),\n", | |
" ('companies', 1),\n", | |
" ('companies delivered', 1),\n", | |
" ('companies delivered policies', 1),\n", | |
" ('companion', 1),\n", | |
" ('companion case', 1),\n", | |
" ('companion case plaintiff', 1),\n", | |
" ('company', 12),\n", | |
" ('company accordingly', 1),\n", | |
" ('company accordingly judgment', 1),\n", | |
" ('company acts', 1),\n", | |
" ('company acts calhoun', 1),\n", | |
" ('company case', 2),\n", | |
" ('company case removed', 1),\n", | |
" ('company case supra', 1),\n", | |
" ('company contended', 1),\n", | |
" ('company contended defendant', 1),\n", | |
" ('company defendant', 2),\n", | |
" ('company defendant case', 2),\n", | |
" ('company insuring', 1),\n", | |
" ('company insuring steamer', 1),\n", | |
" ('company notice', 1),\n", | |
" ('company notice considered', 1),\n", | |
" ('company risk', 1),\n", | |
" ('company risk time', 1),\n", | |
" ('company way', 2),\n", | |
" ('company way estoppel', 2),\n", | |
" ('complained', 1),\n", | |
" ('complained held', 1),\n", | |
" ('complained held error', 1),\n", | |
" ('concerning', 2),\n", | |
" ('concerning mortgage', 1),\n", | |
" ('concerning mortgage excess', 1),\n", | |
" ('concerning question', 1),\n", | |
" ('concerning question agency', 1),\n", | |
" ('concurrent', 2),\n", | |
" ('concurrent covered', 1),\n", | |
" ('concurrent covered policy', 1),\n", | |
" ('concurrent insurance', 1),\n", | |
" ('concurrent insurance notice', 1),\n", | |
" ('condition', 1),\n", | |
" ('condition questions', 1),\n", | |
" ('condition questions fully', 1),\n", | |
" ('conditions', 1),\n", | |
" ('conditions claim', 1),\n", | |
" ('conditions claim defendant', 1),\n", | |
" ('considered', 2),\n", | |
" ('considered determined', 1),\n", | |
" ('considered determined court', 1),\n", | |
" ('considered notice', 1),\n", | |
" ('considered notice company', 1),\n", | |
" ('contended', 2),\n", | |
" ('contended defendant', 1),\n", | |
" ('contended defendant matter', 1),\n", | |
" ('contended plaintiff', 1),\n", | |
" ('contended plaintiff calhoun', 1),\n", | |
" ('contract', 3),\n", | |
" ('contract insurance', 3),\n", | |
" ('contract insurance delivering', 1),\n", | |
" ('contract insurance delivery', 1),\n", | |
" ('contract insurance sum', 1),\n", | |
" ('court', 12),\n", | |
" ('court affirmed', 1),\n", | |
" ('court close', 1),\n", | |
" ('court close testimony', 1),\n", | |
" ('court concerning', 1),\n", | |
" ('court concerning question', 1),\n", | |
" ('court erred', 1),\n", | |
" ('court erred giving', 1),\n", | |
" ('court evidence', 1),\n", | |
" ('court evidence sufficient', 1),\n", | |
" ('court instruct', 1),\n", | |
" ('court instruct jury', 1),\n", | |
" ('court issues', 1),\n", | |
" ('court issues present', 1),\n", | |
" ('court practically', 1),\n", | |
" ('court practically record', 1),\n", | |
" ('court requires', 1),\n", | |
" ('court requires discussion', 1),\n", | |
" ('court single', 1),\n", | |
" ('court single assignment', 1),\n", | |
" ('court state', 1),\n", | |
" ('court state washington', 1),\n", | |
" ('court united', 1),\n", | |
" ('court united states', 1),\n", | |
" ('cover', 1),\n", | |
" ('cover holder', 1),\n", | |
" ('cover holder mortgage', 1),\n", | |
" ('covered', 1),\n", | |
" ('covered policy', 1),\n", | |
" ('covered policy void', 1),\n", | |
" ('cricket', 1),\n", | |
" ('cricket hull', 1),\n", | |
" ('cricket hull cabins', 1),\n", | |
" ('damage', 2),\n", | |
" ('damage case', 1),\n", | |
" ('damage case extent', 1),\n", | |
" ('damage extent', 1),\n", | |
" ('damage extent petition', 1),\n", | |
" ('dealing', 1),\n", | |
" ('dealing agent', 1),\n", | |
" ('dealing agent insured', 1),\n", | |
" ('dealt', 1),\n", | |
" ('dealt agent', 1),\n", | |
" ('dealt agent insured', 1),\n", | |
" ('defendant', 12),\n", | |
" ('defendant case', 3),\n", | |
" ('defendant case agents', 1),\n", | |
" ('defendant case brought', 1),\n", | |
" ('defendant case calhoun', 1),\n", | |
" ('defendant company', 1),\n", | |
" ('defendant company risk', 1),\n", | |
" ('defendant error', 1),\n", | |
" ('defendant error error', 1),\n", | |
" ('defendant insurance', 1),\n", | |
" ('defendant insurance company', 1),\n", | |
" ('defendant matter', 1),\n", | |
" ('defendant matter law', 1),\n", | |
" ('defendant plaintiff', 1),\n", | |
" ('defendant plaintiff error', 1),\n", | |
" ('defendant replied', 1),\n", | |
" ('defendant replied calhoun', 1),\n", | |
" ('defendant sued', 1),\n", | |
" ('defendant sued writ', 1),\n", | |
" ('defendant transaction', 1),\n", | |
" ('defendant transaction dealt', 1),\n", | |
" ('defendant trial', 1),\n", | |
" ('defendant trial court', 1),\n", | |
" ('defense', 1),\n", | |
" ('defense evidence', 1),\n", | |
" ('defense evidence remaining', 1),\n", | |
" ('delivered', 1),\n", | |
" ('delivered policies', 1),\n", | |
" ('delivered policies agent', 1),\n", | |
" ('delivering', 1),\n", | |
" ('delivering policy', 1),\n", | |
" ('delivering policy agent', 1),\n", | |
" ('delivery', 1),\n", | |
" ('delivery policy', 1),\n", | |
" ('delivery policy case', 1),\n", | |
" ('demand', 1),\n", | |
" ('demand motion', 1),\n", | |
" ('demand motion new', 1),\n", | |
" ('denied', 1),\n", | |
" ('denied judgment', 1),\n", | |
" ('denied judgment entered', 1),\n", | |
" ('deny', 1),\n", | |
" ('deny validity', 1),\n", | |
" ('deny validity policy', 1),\n", | |
" ('detailed', 1),\n", | |
" ('detailed statement', 1),\n", | |
" ('detailed statement material', 1),\n", | |
" ('determined', 1),\n", | |
" ('determined court', 1),\n", | |
" ('determined court requires', 1),\n", | |
" ('discussed', 1),\n", | |
" ('discussed passed', 1),\n", | |
" ('discussed passed case', 1),\n", | |
" ('discussion', 1),\n", | |
" ('discussion remaining', 1),\n", | |
" ('discussion remaining assignments', 1),\n", | |
" ('dispute', 1),\n", | |
" ('dispute validity', 1),\n", | |
" ('dispute validity policy', 1),\n", | |
" ('distinction', 1),\n", | |
" ('distinction noticed', 1),\n", | |
" ('distinction noticed legal', 1),\n", | |
" ('district', 1),\n", | |
" ('district washington', 1),\n", | |
" ('district washington tried', 1),\n", | |
" ('endeavored', 1),\n", | |
" ('endeavored secure', 1),\n", | |
" ('endeavored secure insurance', 1),\n", | |
" ('entered', 2),\n", | |
" ('entered defendant', 1),\n", | |
" ('entered defendant case', 1),\n", | |
" ('entered favor', 1),\n", | |
" ('entered favor plaintiff', 1),\n", | |
" ('erred', 1),\n", | |
" ('erred giving', 1),\n", | |
" ('erred giving peremptory', 1),\n", | |
" ('error', 8),\n", | |
" ('error assigned', 1),\n", | |
" ('error assigned relates', 1),\n", | |
" ('error case', 2),\n", | |
" ('error case british', 1),\n", | |
" ('error case mcelroy', 1),\n", | |
" ('error court', 1),\n", | |
" ('error court erred', 1),\n", | |
" ('error error', 1),\n", | |
" ('error error assigned', 1),\n", | |
" ('error evidence', 1),\n", | |
" ('error evidence case', 1),\n", | |
" ('error judgment', 1),\n", | |
" ('error judgment circuit', 1),\n", | |
" ('error relate', 1),\n", | |
" ('error relate instructions', 1),\n", | |
" ('established', 1),\n", | |
" ('established evidence', 1),\n", | |
" ('established evidence hand', 1),\n", | |
" ('estoppel', 2),\n", | |
" ('estoppel deny', 1),\n", | |
" ('estoppel deny validity', 1),\n", | |
" ('estoppel dispute', 1),\n", | |
" ('estoppel dispute validity', 1),\n", | |
" ('evidence', 9),\n", | |
" ('evidence british', 1),\n", | |
" ('evidence british america', 1),\n", | |
" ('evidence calhoun', 1),\n", | |
" ('evidence calhoun having', 1),\n", | |
" ('evidence case', 2),\n", | |
" ('evidence case knowledge', 1),\n", | |
" ('evidence case substantially', 1),\n", | |
" ('evidence facts', 1),\n", | |
" ('evidence facts contended', 1),\n", | |
" ('evidence hand', 1),\n", | |
" ('evidence hand evidence', 1),\n", | |
" ('evidence remaining', 1),\n", | |
" ('evidence remaining issues', 1),\n", | |
" ('evidence sufficient', 1),\n", | |
" ('evidence sufficient jury', 1),\n", | |
" ('evidence tending', 1),\n", | |
" ('evidence tending calhoun', 1),\n", | |
" ('excess', 2),\n", | |
" ('excess concurrent', 2),\n", | |
" ('excess concurrent covered', 1),\n", | |
" ('excess concurrent insurance', 1),\n", | |
" ('excessive', 1),\n", | |
" ('excessive insurance', 1),\n", | |
" ('excessive insurance imputed', 1),\n", | |
" ('existence', 1),\n", | |
" ('existence chattel', 1),\n", | |
" ('existence chattel mortgage', 1),\n", | |
" ('express', 1),\n", | |
" ('express terms', 1),\n", | |
" ('express terms provided', 1),\n", | |
" ('expressed', 1),\n", | |
" ('expressed instructions', 1),\n", | |
" ('expressed instructions complained', 1),\n", | |
" ('extent', 2),\n", | |
" ('extent negotiations', 1),\n", | |
" ('extent negotiations leading', 1),\n", | |
" ('extent petition', 1),\n", | |
" ('extent petition defendant', 1),\n", | |
" ('f', 1),\n", | |
" ('f mcelroy', 1),\n", | |
" ('f mcelroy superior', 1),\n", | |
" ('fact', 1),\n", | |
" ('fact property', 1),\n", | |
" ('fact property incumbered', 1),\n", | |
" ('facts', 2),\n", | |
" ('facts contended', 1),\n", | |
" ('facts contended plaintiff', 1),\n", | |
" ('facts report', 1),\n", | |
" ('facts report british', 1),\n", | |
" ('favor', 2),\n", | |
" ('favor mcelroy', 1),\n", | |
" ('favor mcelroy demand', 1),\n", | |
" ('favor plaintiff', 1),\n", | |
" ('favor plaintiff reverse', 1),\n", | |
" ('fed', 1),\n", | |
" ('fed companion', 1),\n", | |
" ('fed companion case', 1),\n", | |
" ('firm', 1),\n", | |
" ('firm insurance', 1),\n", | |
" ('firm insurance agents', 1),\n", | |
" ('forbidden', 1),\n", | |
" ('forbidden acts', 1),\n", | |
" ('forbidden acts plaintiff', 1),\n", | |
" ('fully', 1),\n", | |
" ('fully discussed', 1),\n", | |
" ('fully discussed passed', 1),\n", | |
" ('furniture', 1),\n", | |
" ('furniture loss', 1),\n", | |
" ('furniture loss damage', 1),\n", | |
" ('given', 1),\n", | |
" ('given instructions', 1),\n", | |
" ('given instructions refused', 1),\n", | |
" ('giving', 1),\n", | |
" ('giving peremptory', 1),\n", | |
" ('giving peremptory instruction', 1),\n", | |
" ('grant', 1),\n", | |
" ('grant new', 1),\n", | |
" ('grant new trial', 1),\n", | |
" ('ground', 1),\n", | |
" ('ground defense', 1),\n", | |
" ('ground defense evidence', 1),\n", | |
" ('hand', 1),\n", | |
" ('hand evidence', 1),\n", | |
" ('hand evidence tending', 1),\n", | |
" ('having', 1),\n", | |
" ('having secured', 1),\n", | |
" ('having secured application', 1),\n", | |
" ('heard', 1),\n", | |
" ('heard trial', 1),\n", | |
" ('heard trial court', 1),\n", | |
" ('held', 2),\n", | |
" ('held court', 1),\n", | |
" ('held court evidence', 1),\n", | |
" ('held error', 1),\n", | |
" ('held error judgment', 1),\n", | |
" ('holder', 1),\n", | |
" ('holder mortgage', 1),\n", | |
" ('holder mortgage calhoun', 1),\n", | |
" ('hull', 1),\n", | |
" ('hull cabins', 1),\n", | |
" ('hull cabins tackle', 1),\n", | |
" ('imputed', 1),\n", | |
" ('imputed company', 1),\n", | |
" ('imputed company notice', 1),\n", | |
" ('incumbered', 2),\n", | |
" ('incumbered chattel', 2),\n", | |
" ('incumbered chattel mortgage', 2),\n", | |
" ('indorsed', 1),\n", | |
" ('indorsed policy', 1),\n", | |
" ('indorsed policy ground', 1),\n", | |
" ('instruct', 1),\n", | |
" ('instruct jury', 1),\n", | |
" ('instruct jury verdict', 1),\n", | |
" ('instructed', 1),\n", | |
" ('instructed jury', 1),\n", | |
" ('instructed jury return', 1),\n", | |
" ('instruction', 1),\n", | |
" ('instruction question', 1),\n", | |
" ('instruction question presented', 1),\n", | |
" ('instructions', 4),\n", | |
" ('instructions complained', 1),\n", | |
" ('instructions complained held', 1),\n", | |
" ('instructions given', 1),\n", | |
" ('instructions given instructions', 1),\n", | |
" ('instructions grant', 1),\n", | |
" ('instructions grant new', 1),\n", | |
" ('instructions refused', 1),\n", | |
" ('instructions refused court', 1),\n", | |
" ('insurance', 19),\n", | |
" ('insurance agents', 1),\n", | |
" ('insurance agents brokers', 1),\n", | |
" ('insurance company', 6),\n", | |
" ('insurance company accordingly', 1),\n", | |
" ('insurance company case', 1),\n", | |
" ('insurance company defendant', 1),\n", | |
" ('insurance company insuring', 1),\n", | |
" ('insurance company way', 2),\n", | |
" ('insurance cover', 1),\n", | |
" ('insurance cover holder', 1),\n", | |
" ('insurance delivering', 1),\n", | |
" ('insurance delivering policy', 1),\n", | |
" ('insurance delivery', 1),\n", | |
" ('insurance delivery policy', 1),\n", | |
" ('insurance imputed', 1),\n", | |
" ('insurance imputed company', 1),\n", | |
" ('insurance issued', 1),\n", | |
" ('insurance issued mrs', 1),\n", | |
" ('insurance loss', 1),\n", | |
" ('insurance loss damage', 1),\n", | |
" ('insurance notice', 1),\n", | |
" ('insurance notice knowledge', 1),\n", | |
" ('insurance obtained', 1),\n", | |
" ('insurance obtained excess', 1),\n", | |
" ('insurance placed', 1),\n", | |
" ('insurance placed evidence', 1),\n", | |
" ('insurance policy', 1),\n", | |
" ('insurance policy brought', 1),\n", | |
" ('insurance sum', 2),\n", | |
" ('insurance sum notice', 1),\n", | |
" ('insurance sum placed', 1),\n", | |
" ('insured', 7),\n", | |
" ('insured bind', 1),\n", | |
" ('insured bind insurance', 1),\n", | |
" ('insured bound', 1),\n", | |
" ('insured bound insurance', 1),\n", | |
" ('insured notice', 1),\n", | |
" ('insured notice knowledge', 1),\n", | |
" ('insured secure', 1),\n", | |
" ('insured secure insurance', 1),\n", | |
" ('insured securing', 2),\n", | |
" ('insured securing contract', 2),\n", | |
" ('insured time', 1),\n", | |
" ('insured time collected', 1),\n", | |
" ('insuring', 1),\n", | |
" ('insuring steamer', 1),\n", | |
" ('insuring steamer cricket', 1),\n", | |
" ('intention', 1),\n", | |
" ('intention insured', 1),\n", | |
" ('intention insured secure', 1),\n", | |
" ('introduced', 1),\n", | |
" ('introduced plaintiff', 1),\n", | |
" ('introduced plaintiff right', 1),\n", | |
" ('involve', 1),\n", | |
" ('involve question', 1),\n", | |
" ('involve question evidence', 1),\n", | |
" ('issuance', 1),\n", | |
" ('issuance policies', 1),\n", | |
" ('issuance policies parties', 1),\n", | |
" ('issued', 2),\n", | |
" ('issued defendant', 1),\n", | |
" ('issued defendant company', 1),\n", | |
" ('issued mrs', 1),\n", | |
" ('issued mrs j', 1),\n", | |
" ('issues', 2),\n", | |
" ('issues present', 1),\n", | |
" ('issues present case', 1),\n", | |
" ('issues submitted', 1),\n", | |
" ('issues submitted jury', 1),\n", | |
" ('j', 1),\n", | |
" ('j c', 1),\n", | |
" ('j c powers', 1),\n", | |
" ('james', 1),\n", | |
" ('james f', 1),\n", | |
" ('james f mcelroy', 1),\n", | |
" ('judgment', 3),\n", | |
" ('judgment circuit', 1),\n", | |
" ('judgment circuit court', 1),\n", | |
" ('judgment entered', 2),\n", | |
" ('judgment entered defendant', 1),\n", | |
" ('judgment entered favor', 1),\n", | |
" ('jury', 6),\n", | |
" ('jury case', 1),\n", | |
" ('jury case remanded', 1),\n", | |
" ('jury held', 1),\n", | |
" ('jury held court', 1),\n", | |
" ('jury return', 1),\n", | |
" ('jury return verdict', 1),\n", | |
" ('jury verdict', 3),\n", | |
" ('jury verdict defendant', 1),\n", | |
" ('jury verdict rendered', 1),\n", | |
" ('jury verdict returned', 1),\n", | |
" ('knowledge', 4),\n", | |
" ('knowledge calhoun', 1),\n", | |
" ('knowledge calhoun excessive', 1),\n", | |
" ('knowledge concerning', 1),\n", | |
" ('knowledge concerning mortgage', 1),\n", | |
" ('knowledge defendant', 1),\n", | |
" ('knowledge defendant trial', 1),\n", | |
" ('knowledge existence', 1),\n", | |
" ('knowledge existence chattel', 1),\n", | |
" ('law', 1),\n", | |
" ('law policy', 1),\n", | |
" ('law policy void', 1),\n", | |
" ('leading', 1),\n", | |
" ('leading issuance', 1),\n", | |
" ('leading issuance policies', 1),\n", | |
" ('legal', 1),\n", | |
" ('legal aspect', 1),\n", | |
" ('legal aspect cases', 1),\n", | |
" ('loss', 2),\n", | |
" ('loss damage', 2),\n", | |
" ('loss damage case', 1),\n", | |
" ('loss damage extent', 1),\n", | |
" ('material', 1),\n", | |
" ('material facts', 1),\n", | |
" ('material facts report', 1),\n", | |
" ('matter', 1),\n", | |
" ('matter law', 1),\n", | |
" ('matter law policy', 1),\n", | |
" ('mcelroy', 3),\n", | |
" ('mcelroy demand', 1),\n", | |
" ('mcelroy demand motion', 1),\n", | |
" ('mcelroy superior', 1),\n", | |
" ('mcelroy superior court', 1),\n", | |
" ('mcelroy v', 1),\n", | |
" ('mcelroy v british', 1),\n", | |
" (\"mcelroy's\", 1),\n", | |
" (\"mcelroy's assignor\", 1),\n", | |
" (\"mcelroy's assignor palatine\", 1),\n", | |
" ('mortgage', 5),\n", | |
" ('mortgage calhoun', 1),\n", | |
" ('mortgage calhoun endeavored', 1),\n", | |
" ('mortgage excess', 1),\n", | |
" ('mortgage excess concurrent', 1),\n", | |
" ('mortgage insurance', 1),\n", | |
" ('mortgage insurance obtained', 1),\n", | |
" ('mortgage intention', 1),\n", | |
" ('mortgage intention insured', 1),\n", | |
" ('mortgage provided', 1),\n", | |
" ('mortgage provided agreement', 1),\n", | |
" ('motion', 1),\n", | |
" ('motion new', 1),\n", | |
" ('motion new trial', 1),\n", | |
" ('mrs', 1),\n", | |
" ('mrs j', 1),\n", | |
" ('mrs j c', 1),\n", | |
" ('negotiated', 1),\n", | |
" ('negotiated insurance', 1),\n", | |
" ('negotiated insurance sum', 1),\n", | |
" ('negotiations', 1),\n", | |
" ('negotiations leading', 1),\n", | |
" ('negotiations leading issuance', 1),\n", | |
" ('new', 2),\n", | |
" ('new trial', 2),\n", | |
" ('new trial case', 1),\n", | |
" ('new trial denied', 1),\n", | |
" ('notice', 5),\n", | |
" ('notice company', 1),\n", | |
" ('notice company acts', 1),\n", | |
" ('notice considered', 1),\n", | |
" ('notice considered notice', 1),\n", | |
" ('notice knowledge', 3),\n", | |
" ('notice knowledge concerning', 1),\n", | |
" ('notice knowledge defendant', 1),\n", | |
" ('notice knowledge existence', 1),\n", | |
" ('noticed', 1),\n", | |
" ('noticed legal', 1),\n", | |
" ('noticed legal aspect', 1),\n", | |
" ('obtained', 1),\n", | |
" ('obtained excess', 1),\n", | |
" ('obtained excess concurrent', 1),\n", | |
" ('palatine', 2),\n", | |
" ('palatine insurance', 2),\n", | |
" ('palatine insurance company', 2),\n", | |
" ('parties', 1),\n", | |
" ('parties case', 1),\n", | |
" ('parties case detailed', 1),\n", | |
" ('passed', 1),\n", | |
" ('passed case', 1),\n", | |
" ('passed case accordance', 1),\n", | |
" ('peremptory', 1),\n", | |
" ('peremptory instruction', 1),\n", | |
" ('peremptory instruction question', 1),\n", | |
" ('petition', 1),\n", | |
" ('petition defendant', 1),\n", | |
" ('petition defendant insurance', 1),\n", | |
" ('placed', 2),\n", | |
" ('placed agent', 1),\n", | |
" ('placed agent british', 1),\n", | |
" ('placed evidence', 1),\n", | |
" ('placed evidence calhoun', 1),\n", | |
" ('plaintiff', 8),\n", | |
" ('plaintiff action', 1),\n", | |
" ('plaintiff action brought', 1),\n", | |
" ('plaintiff calhoun', 1),\n", | |
" ('plaintiff calhoun agents', 1),\n", | |
" ('plaintiff defendant', 1),\n", | |
" ('plaintiff defendant error', 1),\n", | |
" ('plaintiff error', 1),\n", | |
" ('plaintiff error evidence', 1),\n", | |
" ('plaintiff established', 1),\n", | |
" ('plaintiff established evidence', 1),\n", | |
" ('plaintiff james', 1),\n", | |
" ('plaintiff james f', 1),\n", | |
" ('plaintiff reverse', 1),\n", | |
" ('plaintiff reverse defendant', 1),\n", | |
" ('plaintiff right', 1),\n", | |
" ('plaintiff right case', 1),\n", | |
" ('policies', 3),\n", | |
" ('policies agent', 1),\n", | |
" ('policies agent insured', 1),\n", | |
" ('policies amounts', 1),\n", | |
" ('policies amounts stated', 1),\n", | |
" ('policies parties', 1),\n", | |
" ('policies parties case', 1),\n", | |
" ('policy', 10),\n", | |
" ('policy account', 1),\n", | |
" ('policy account conditions', 1),\n", | |
" ('policy agent', 1),\n", | |
" ('policy agent insured', 1),\n", | |
" ('policy brought', 1),\n", | |
" ('policy brought plaintiff', 1),\n", | |
" ('policy case', 1),\n", | |
" ('policy case agent', 1),\n", | |
" ('policy ground', 1),\n", | |
" ('policy ground defense', 1),\n", | |
" ('policy insurance', 1),\n", | |
" ('policy insurance issued', 1),\n", | |
" ('policy issued', 1),\n", | |
" ('policy issued defendant', 1),\n", | |
" ('policy reason', 1),\n", | |
" ('policy reason alleged', 1),\n", | |
" ('policy void', 2),\n", | |
" ('policy void forbidden', 1),\n", | |
" ('policy void reason', 1),\n", | |
" ('portion', 1),\n", | |
" ('portion premium', 1),\n", | |
" ('portion premium evidence', 1),\n", | |
" ('powers', 1),\n", | |
" (\"powers mcelroy's\", 1),\n", | |
" (\"powers mcelroy's assignor\", 1),\n", | |
" ('practically', 1),\n", | |
" ('practically record', 1),\n", | |
" ('practically record distinction', 1),\n", | |
" ('premium', 1),\n", | |
" ('premium evidence', 1),\n", | |
" ('premium evidence facts', 1),\n", | |
" ('present', 1),\n", | |
" ('present case', 1),\n", | |
" ('present case fact', 1),\n", | |
" ('presented', 1),\n", | |
" ('presented testimony', 1),\n", | |
" ('presented testimony introduced', 1),\n", | |
" ('property', 2),\n", | |
" ('property incumbered', 2),\n", | |
" ('property incumbered chattel', 2),\n", | |
" ('provided', 2),\n", | |
" ('provided agreement', 1),\n", | |
" ('provided agreement indorsed', 1),\n", | |
" ('provided property', 1),\n", | |
" ('provided property incumbered', 1),\n", | |
" ('question', 4),\n", | |
" ('question agency', 1),\n", | |
" ('question agency involve', 1),\n", | |
" ('question considered', 1),\n", | |
" ('question considered determined', 1),\n", | |
" ('question evidence', 1),\n", | |
" ('question evidence case', 1),\n", | |
" ('question presented', 1),\n", | |
" ('question presented testimony', 1),\n", | |
" ('questions', 1),\n", | |
" ('questions fully', 1),\n", | |
" ('questions fully discussed', 1),\n", | |
" ('reason', 2),\n", | |
" ('reason alleged', 1),\n", | |
" ('reason alleged breach', 1),\n", | |
" ('reason express', 1),\n", | |
" ('reason express terms', 1),\n", | |
" ('received', 1),\n", | |
" ('received written', 1),\n", | |
" ('received written policies', 1),\n", | |
" ('record', 1),\n", | |
" ('record distinction', 1),\n", | |
" ('record distinction noticed', 1),\n", | |
" ('recover', 2),\n", | |
" ('recover policy', 1),\n", | |
" ('recover policy issued', 1),\n", | |
" ('recover sum', 1),\n", | |
" ('recover sum alleged', 1),\n", | |
" ('refusal', 1),\n", | |
" ('refusal court', 1),\n", | |
" ('refusal court instruct', 1),\n", | |
" ('refused', 1),\n", | |
" ('refused court', 1),\n", | |
" ('refused court concerning', 1),\n", | |
" ('relate', 1),\n", | |
" ('relate instructions', 1),\n", | |
" ('relate instructions given', 1),\n", | |
" ('relates', 1),\n", | |
" ('relates refusal', 1),\n", | |
" ('relates refusal court', 1),\n", | |
" ('remaining', 2),\n", | |
" ('remaining assignments', 1),\n", | |
" ('remaining assignments error', 1),\n", | |
" ('remaining issues', 1),\n", | |
" ('remaining issues submitted', 1),\n", | |
" ('remanded', 1),\n", | |
" ('remanded instructions', 1),\n", | |
" ('remanded instructions grant', 1),\n", | |
" ('removed', 1),\n", | |
" ('removed circuit', 1),\n", | |
" ('removed circuit court', 1),\n", | |
" ('rendered', 1),\n", | |
" ('rendered plaintiff', 1),\n", | |
" ('rendered plaintiff defendant', 1),\n", | |
" ('replied', 1),\n", | |
" ('replied calhoun', 1),\n", | |
" ('replied calhoun agents', 1),\n", | |
" ('report', 1),\n", | |
" ('report british', 1),\n", | |
" ('report british america', 1),\n", | |
" ('reported', 1),\n", | |
" ('reported c', 1),\n", | |
" ('reported c c', 1),\n", | |
" ('requires', 1),\n", | |
" ('requires discussion', 1),\n", | |
" ('requires discussion remaining', 1),\n", | |
" ('return', 1),\n", | |
" ('return verdict', 1),\n", | |
" ('return verdict insurance', 1),\n", | |
" ('returned', 1),\n", | |
" ('returned favor', 1),\n", | |
" ('returned favor mcelroy', 1),\n", | |
" ('reverse', 1),\n", | |
" ('reverse defendant', 1),\n", | |
" ('reverse defendant sued', 1),\n", | |
" ('right', 1),\n", | |
" ('right case', 1),\n", | |
" ('right case submitted', 1),\n", | |
" ('risk', 1),\n", | |
" ('risk time', 1),\n", | |
" ('risk time substantially', 1),\n", | |
" ('seattle', 1),\n", | |
" ('seattle negotiated', 1),\n", | |
" ('seattle negotiated insurance', 1),\n", | |
" ('secure', 2),\n", | |
" ('secure insurance', 2),\n", | |
" ('secure insurance cover', 1),\n", | |
" ('secure insurance placed', 1),\n", | |
" ('secured', 1),\n", | |
" ('secured application', 1),\n", | |
" ('secured application contract', 1),\n", | |
" ('securing', 2),\n", | |
" ('securing contract', 2),\n", | |
" ('securing contract insurance', 2),\n", | |
" ('single', 1),\n", | |
" ('single assignment', 1),\n", | |
" ('single assignment error', 1),\n", | |
" ('state', 1),\n", | |
" ('state washington', 1),\n", | |
" ('state washington recover', 1),\n", | |
" ('stated', 1),\n", | |
" ('stated agents', 1),\n", | |
" ('stated agents companies', 1),\n", | |
" ('statement', 1),\n", | |
" ('statement material', 1),\n", | |
" ('statement material facts', 1),\n", | |
" ('states', 1),\n", | |
" ('states district', 1),\n", | |
" ('states district washington', 1),\n", | |
" ('steamer', 1),\n", | |
" ('steamer cricket', 1),\n", | |
" ('steamer cricket hull', 1),\n", | |
" ('submitted', 2),\n", | |
" ('submitted jury', 2),\n", | |
" ('submitted jury held', 1),\n", | |
" ('submitted jury verdict', 1),\n", | |
" ('substantially', 2),\n", | |
" ('substantially circumstances', 1),\n", | |
" ('substantially circumstances insurance', 1),\n", | |
" ('substantially evidence', 1),\n", | |
" ('substantially evidence british', 1),\n", | |
" ('sued', 1),\n", | |
" ('sued writ', 1),\n", | |
" ('sued writ error', 1),\n", | |
" ('sufficient', 1),\n", | |
" ('sufficient jury', 1),\n", | |
" ('sufficient jury case', 1),\n", | |
" ('sum', 3),\n", | |
" ('sum alleged', 1),\n", | |
" ('sum alleged policy', 1),\n", | |
" ('sum notice', 1),\n", | |
" ('sum notice knowledge', 1),\n", | |
" ('sum placed', 1),\n", | |
" ('sum placed agent', 1),\n", | |
" ('superior', 1),\n", | |
" ('superior court', 1),\n", | |
" ('superior court state', 1),\n", | |
" ('supra', 1),\n", | |
" ('supra cases', 1),\n", | |
" ('supra cases come', 1),\n", | |
" ('tackle', 1),\n", | |
" ('tackle furniture', 1),\n", | |
" ('tackle furniture loss', 1),\n", | |
" ('tending', 1),\n", | |
" ('tending calhoun', 1),\n", | |
" ('tending calhoun firm', 1),\n", | |
" ('terms', 1),\n", | |
" ('terms provided', 1),\n", | |
" ('terms provided property', 1),\n", | |
" ('testimony', 2),\n", | |
" ('testimony instructed', 1),\n", | |
" ('testimony instructed jury', 1),\n", | |
" ('testimony introduced', 1),\n", | |
" ('testimony introduced plaintiff', 1),\n", | |
" ('time', 2),\n", | |
" ('time collected', 1),\n", | |
" ('time collected portion', 1),\n", | |
" ('time substantially', 1),\n", | |
" ('time substantially circumstances', 1),\n", | |
" ('transaction', 1),\n", | |
" ('transaction dealt', 1),\n", | |
" ('transaction dealt agent', 1),\n", | |
" ('trial', 4),\n", | |
" ('trial case', 1),\n", | |
" ('trial case bar', 1),\n", | |
" ('trial court', 2),\n", | |
" ('trial court close', 1),\n", | |
" ('trial court issues', 1),\n", | |
" ('trial denied', 1),\n", | |
" ('trial denied judgment', 1),\n", | |
" ('tried', 1),\n", | |
" ('tried jury', 1),\n", | |
" ('tried jury verdict', 1),\n", | |
" ('united', 1),\n", | |
" ('united states', 1),\n", | |
" ('united states district', 1),\n", | |
" ('v', 1),\n", | |
" ('v british', 1),\n", | |
" ('v british america', 1),\n", | |
" ('validity', 2),\n", | |
" ('validity policy', 2),\n", | |
" ('validity policy account', 1),\n", | |
" ('validity policy reason', 1),\n", | |
" ('verdict', 4),\n", | |
" ('verdict defendant', 1),\n", | |
" ('verdict defendant plaintiff', 1),\n", | |
" ('verdict insurance', 1),\n", | |
" ('verdict insurance company', 1),\n", | |
" ('verdict rendered', 1),\n", | |
" ...])" | |
] | |
} | |
], | |
"prompt_number": 156 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Extract the unique vocabulary. This honors the ``nmin`` and ``nmax`` settings if they were passed when the ``SparkBloombergCaseVectorizer`` instance was initialized:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.vocab_rdd.cache()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 157, | |
"text": [ | |
"PythonRDD[367] at RDD at PythonRDD.scala:37" | |
] | |
} | |
], | |
"prompt_number": 157 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%time cv.vocab_rdd.count()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"CPU times: user 6.76 ms, sys: 2.36 ms, total: 9.13 ms\n", | |
"Wall time: 2.37 s\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 158, | |
"text": [ | |
"945483" | |
] | |
} | |
], | |
"prompt_number": 158 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The \"vocabulary\" is a collection of unique (filtered) ngrams:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.vocab_rdd.take(10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 159, | |
"text": [ | |
"['ab',\n", | |
" 'ab initio',\n", | |
" 'ab initio notwithstanding',\n", | |
" 'ab initio prevent',\n", | |
" 'ab initio voidable',\n", | |
" 'aback',\n", | |
" 'aback course',\n", | |
" 'aback course swung',\n", | |
" 'aback wind',\n", | |
" 'aback wind light']" | |
] | |
} | |
], | |
"prompt_number": 159 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can extract specific ngrams if we want (note that something like this could easily be added as a convenience function to ``SparkBloombergCaseVectorizer``):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.vocab_rdd.filter(lambda x: 'evidence' in x).takeSample(False, 10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 160, | |
"text": [ | |
"['total lack evidence',\n", | |
" 'evidence shows use',\n", | |
" 'tendency evidence',\n", | |
" 'evidence general allegations',\n", | |
" 'certain oral evidence',\n", | |
" 'evidence objection',\n", | |
" 'evidence result thoughtless',\n", | |
" 'evidence master choctaw',\n", | |
" 'evidence fully establishes',\n", | |
" 'circumstantial evidence']" | |
] | |
} | |
], | |
"prompt_number": 160 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You can get a mapping from ngrams to indices that can be used later to look up features:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.get_vocab_map().take(10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 161, | |
"text": [ | |
"[('ab', 1038175),\n", | |
" ('ab initio', 69239),\n", | |
" ('ab initio notwithstanding', 33049),\n", | |
" ('ab initio prevent', 362316),\n", | |
" ('ab initio voidable', 736135),\n", | |
" ('aback', 72242),\n", | |
" ('aback course', 764017),\n", | |
" ('aback course swung', 235052),\n", | |
" ('aback wind', 833153),\n", | |
" ('aback wind light', 843844)]" | |
] | |
} | |
], | |
"prompt_number": 161 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If we just want the feature vectors for passing data into another algorithm, we can get them from ``docvec_rdd``." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Here, we convert the feature counts into a [``SparseVector``](http://spark.apache.org/docs/latest/api/python/pyspark.mllib.linalg.SparseVector-class.html). This may not be the best choice, since it has no ``add`` method; ``scipy.sparse`` types could be used instead." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"sparkgram.document_vectorizer.next_power_of_two(cv.vocab_rdd.count())" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 133, | |
"text": [ | |
"131072" | |
] | |
} | |
], | |
"prompt_number": 133 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.docvec_rdd.first()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 134, | |
"text": [ | |
"('XFLMS2||01oct1900||Criminal Law||1900||9||GILBERT, WILLIAM B.||MORROW, WILLIAM W.||ROSS, ERSKINE M.||GILBERT, WILLIAM B.||GILBERT||0||0||1',\n", | |
" SparseVector(131072, {179: 1.0, 261: 1.0, 262: 1.0, 336: 1.0, 569: 1.0, 603: 1.0, 749: 1.0, 921: 1.0, 943: 2.0, 1019: 1.0, 1061: 1.0, 1513: 1.0, 1516: 1.0, 1554: 1.0, 1563: 1.0, 1626: 1.0, 1639: 1.0, 1725: 1.0, 1874: 1.0, 2040: 1.0, 2045: 1.0, 2092: 1.0, 2111: 2.0, 2127: 1.0, 2195: 1.0, 2285: 1.0, 2518: 3.0, 2560: 1.0, 2613: 2.0, 2630: 1.0, 2875: 1.0, 2897: 1.0, 2939: 1.0, 3092: 1.0, 3128: 1.0, 3149: 1.0, 3156: 1.0, 3312: 1.0, 3330: 1.0, 3346: 1.0, 3394: 3.0, 3417: 1.0, 3421: 1.0, 3818: 1.0, 3935: 1.0, 3937: 1.0, 3998: 20.0, 4286: 1.0, 4320: 1.0, 4618: 1.0, 4668: 1.0, 4670: 1.0, 4696: 2.0, 4738: 1.0, 4822: 1.0, 4877: 1.0, 4889: 1.0, 4937: 1.0, 4960: 1.0, 5020: 1.0, 5028: 1.0, 5034: 1.0, 5054: 1.0, 5135: 1.0, 5225: 1.0, 5507: 2.0, 5579: 3.0, 5712: 1.0, 5826: 2.0, 5933: 1.0, 5984: 1.0, 5987: 1.0, 6151: 1.0, 6171: 1.0, 6195: 1.0, 6233: 2.0, 6417: 3.0, 6465: 1.0, 6468: 4.0, 6533: 1.0, 6674: 3.0, 6701: 1.0, 6756: 1.0, 7130: 1.0, 7392: 1.0, 7599: 1.0, 7639: 1.0, 7646: 1.0, 7699: 1.0, 7746: 1.0, 7770: 1.0, 7853: 1.0, 7866: 1.0, 7951: 1.0, 8092: 1.0, 8276: 1.0, 8488: 1.0, 8584: 1.0, 8598: 1.0, 8600: 1.0, 8696: 2.0, 8748: 1.0, 8783: 1.0, 9027: 1.0, 9343: 1.0, 9443: 1.0, 9477: 1.0, 9790: 1.0, 9801: 1.0, 9875: 1.0, 10071: 4.0, 10109: 1.0, 10258: 1.0, 10259: 1.0, 10325: 1.0, 10361: 4.0, 10447: 1.0, 10635: 1.0, 10646: 1.0, 10676: 1.0, 10682: 1.0, 10794: 2.0, 10878: 1.0, 11100: 1.0, 11116: 1.0, 11227: 1.0, 11331: 1.0, 11552: 1.0, 11564: 1.0, 11629: 1.0, 11752: 1.0, 11763: 1.0, 11957: 1.0, 12046: 1.0, 12097: 1.0, 12171: 1.0, 12315: 2.0, 12319: 1.0, 12474: 1.0, 12482: 1.0, 12600: 1.0, 12695: 1.0, 12716: 1.0, 12781: 3.0, 13006: 1.0, 13111: 1.0, 13158: 1.0, 13203: 1.0, 13311: 1.0, 13447: 1.0, 13700: 1.0, 13701: 7.0, 13888: 1.0, 13930: 1.0, 14092: 1.0, 14141: 1.0, 14141: 1.0, 14156: 1.0, 14213: 1.0, 14456: 8.0, 14500: 3.0, 14693: 1.0, 15042: 2.0, 15072: 6.0, 15134: 2.0, 15320: 1.0, 15465: 1.0, 15630: 1.0, 15638: 1.0, 15803: 2.0, 15831: 1.0, 16225: 1.0, 16457: 1.0, 16504: 1.0, 
16570: 1.0, 16749: 1.0, 16769: 2.0, 16815: 2.0, 17096: 1.0, 17225: 1.0, 17272: 1.0, 17453: 1.0, 17455: 1.0, 17472: 1.0, 17474: 1.0, 17538: 1.0, 17593: 2.0, 17657: 1.0, 18018: 2.0, 18054: 2.0, 18099: 1.0, 18261: 1.0, 18272: 1.0, 18318: 1.0, 18359: 2.0, 18373: 2.0, 18497: 3.0, 18634: 1.0, 19015: 2.0, 19336: 1.0, 19456: 1.0, 19492: 1.0, 19592: 1.0, 19592: 1.0, 19653: 1.0, 19777: 2.0, 19970: 4.0, 20066: 1.0, 20091: 1.0, 20131: 1.0, 20206: 1.0, 20206: 1.0, 20266: 1.0, 20278: 3.0, 20368: 1.0, 20506: 1.0, 20622: 1.0, 20623: 1.0, 20774: 1.0, 20867: 2.0, 20914: 1.0, 20933: 1.0, 20989: 1.0, 21004: 1.0, 21006: 1.0, 21047: 1.0, 21108: 1.0, 21198: 1.0, 21253: 1.0, 21283: 1.0, 21427: 1.0, 21433: 1.0, 21478: 2.0, 21662: 1.0, 21745: 1.0, 21776: 1.0, 21856: 1.0, 21908: 1.0, 22076: 1.0, 22156: 1.0, 22250: 1.0, 22263: 1.0, 22460: 1.0, 22518: 2.0, 22594: 1.0, 22646: 4.0, 23202: 1.0, 23270: 1.0, 23295: 2.0, 23353: 4.0, 23370: 1.0, 23592: 2.0, 23659: 1.0, 23829: 1.0, 23882: 1.0, 24115: 2.0, 24149: 1.0, 24158: 1.0, 24211: 1.0, 24257: 1.0, 24398: 1.0, 24509: 1.0, 24514: 1.0, 24662: 1.0, 24678: 1.0, 24776: 1.0, 24786: 1.0, 24913: 1.0, 24945: 1.0, 25433: 1.0, 25532: 1.0, 25849: 1.0, 25918: 1.0, 25919: 1.0, 26078: 1.0, 26080: 1.0, 26387: 1.0, 26550: 1.0, 26661: 1.0, 26690: 1.0, 26833: 1.0, 26912: 1.0, 26958: 3.0, 27000: 1.0, 27095: 1.0, 27111: 2.0, 27137: 1.0, 27191: 1.0, 27239: 1.0, 27254: 1.0, 27426: 1.0, 27541: 1.0, 27620: 1.0, 27631: 1.0, 27744: 1.0, 27857: 1.0, 27998: 1.0, 28003: 1.0, 28297: 1.0, 28315: 1.0, 28458: 1.0, 28648: 3.0, 28690: 1.0, 28794: 1.0, 28822: 1.0, 28840: 1.0, 29162: 5.0, 29241: 1.0, 29389: 1.0, 29551: 1.0, 29599: 1.0, 29694: 1.0, 29701: 1.0, 29961: 1.0, 30035: 1.0, 30304: 1.0, 30308: 1.0, 30500: 1.0, 30693: 1.0, 30778: 2.0, 30828: 1.0, 30912: 1.0, 31150: 1.0, 31165: 1.0, 31175: 1.0, 31246: 1.0, 31279: 1.0, 31327: 1.0, 31418: 2.0, 31457: 1.0, 31547: 1.0, 31674: 1.0, 31867: 1.0, 31882: 2.0, 31910: 1.0, 32014: 1.0, 32035: 1.0, 32089: 1.0, 32169: 1.0, 32172: 1.0, 32284: 
1.0, 32611: 1.0, 32662: 1.0, 32694: 1.0, 32775: 1.0, 32953: 1.0, 33029: 1.0, 33176: 1.0, 33184: 1.0, 33195: 1.0, 33490: 1.0, 33525: 1.0, 33618: 1.0, 33773: 1.0, 33893: 1.0, 33955: 1.0, 34119: 1.0, 34198: 1.0, 34246: 7.0, 34289: 1.0, 34374: 2.0, 34492: 1.0, 34514: 1.0, 34590: 1.0, 34590: 1.0, 34800: 1.0, 34839: 1.0, 34845: 1.0, 34916: 1.0, 34965: 1.0, 34998: 4.0, 35069: 1.0, 35440: 1.0, 35475: 1.0, 35508: 2.0, 35670: 2.0, 35676: 1.0, 35928: 1.0, 35981: 1.0, 36043: 1.0, 36224: 1.0, 36331: 1.0, 36512: 1.0, 36533: 1.0, 36779: 1.0, 36999: 1.0, 37082: 1.0, 37808: 1.0, 38030: 1.0, 38042: 1.0, 38420: 1.0, 38457: 2.0, 38519: 1.0, 38552: 1.0, 38578: 2.0, 38726: 1.0, 39074: 1.0, 39077: 1.0, 39177: 2.0, 39203: 1.0, 39228: 1.0, 39491: 1.0, 39671: 3.0, 39795: 1.0, 39956: 1.0, 40105: 1.0, 40112: 1.0, 40253: 1.0, 40264: 1.0, 40349: 1.0, 40456: 1.0, 40502: 1.0, 40544: 1.0, 40755: 1.0, 40788: 1.0, 41125: 1.0, 41176: 1.0, 41238: 1.0, 41354: 1.0, 41370: 1.0, 41392: 1.0, 41455: 1.0, 41648: 7.0, 41654: 1.0, 41761: 3.0, 41943: 1.0, 41970: 1.0, 42073: 2.0, 42337: 1.0, 42495: 4.0, 42498: 1.0, 42797: 1.0, 42848: 1.0, 43100: 1.0, 43105: 1.0, 43235: 1.0, 43339: 3.0, 43397: 3.0, 43420: 1.0, 43429: 8.0, 43546: 1.0, 43714: 1.0, 43934: 1.0, 44167: 1.0, 44303: 1.0, 44541: 1.0, 44725: 1.0, 44780: 1.0, 44928: 7.0, 45101: 1.0, 45160: 2.0, 45231: 1.0, 45245: 2.0, 45583: 1.0, 45692: 1.0, 45722: 1.0, 45830: 1.0, 45831: 1.0, 45874: 1.0, 45907: 1.0, 45936: 1.0, 45939: 1.0, 45948: 6.0, 46038: 1.0, 46104: 1.0, 46264: 2.0, 46292: 1.0, 46353: 1.0, 46451: 1.0, 46713: 1.0, 46722: 1.0, 46836: 2.0, 46953: 1.0, 46985: 1.0, 47002: 1.0, 47003: 1.0, 47030: 1.0, 47267: 1.0, 47314: 1.0, 47483: 2.0, 47497: 1.0, 47887: 1.0, 47907: 1.0, 48192: 1.0, 48216: 1.0, 48299: 1.0, 48444: 2.0, 48497: 2.0, 48530: 1.0, 48662: 1.0, 48755: 1.0, 48772: 1.0, 48778: 1.0, 48839: 1.0, 49005: 1.0, 49026: 1.0, 49183: 1.0, 49211: 1.0, 49446: 1.0, 49511: 1.0, 49537: 2.0, 49538: 1.0, 49604: 2.0, 49884: 2.0, 49893: 2.0, 50095: 2.0, 50180: 1.0, 
50266: 1.0, 50277: 1.0, 50712: 1.0, 50742: 1.0, 50768: 1.0, 50798: 1.0, 50892: 1.0, 51005: 1.0, 51019: 1.0, 51024: 1.0, 51027: 1.0, 51085: 1.0, 51113: 1.0, 51360: 1.0, 51676: 3.0, 51953: 7.0, 51987: 3.0, 52105: 1.0, 52127: 1.0, 52237: 2.0, 52273: 1.0, 52291: 1.0, 52299: 1.0, 52353: 1.0, 52367: 1.0, 52516: 1.0, 52548: 1.0, 52586: 1.0, 52940: 6.0, 52968: 1.0, 53050: 2.0, 53078: 1.0, 53110: 1.0, 53285: 1.0, 53303: 1.0, 53420: 2.0, 53720: 3.0, 53794: 1.0, 53897: 1.0, 53922: 2.0, 53926: 1.0, 54344: 1.0, 54390: 1.0, 54432: 2.0, 54456: 1.0, 54536: 1.0, 54710: 1.0, 54879: 1.0, 54880: 1.0, 54898: 1.0, 55091: 1.0, 55195: 1.0, 55253: 1.0, 55330: 1.0, 55392: 1.0, 55449: 1.0, 55500: 1.0, 55630: 1.0, 55632: 1.0, 55668: 1.0, 55823: 1.0, 55944: 3.0, 56001: 1.0, 56008: 1.0, 56035: 1.0, 56050: 1.0, 56673: 1.0, 56720: 1.0, 56753: 1.0, 57013: 1.0, 57071: 2.0, 57117: 1.0, 57134: 1.0, 57212: 1.0, 57212: 4.0, 57235: 5.0, 57269: 1.0, 57296: 1.0, 57438: 1.0, 57452: 1.0, 57658: 1.0, 57777: 1.0, 57977: 1.0, 58271: 4.0, 58315: 2.0, 58616: 2.0, 58994: 1.0, 59021: 1.0, 59174: 1.0, 59243: 1.0, 59332: 1.0, 59377: 1.0, 59460: 1.0, 59496: 2.0, 59504: 1.0, 59815: 1.0, 59999: 5.0, 60055: 1.0, 60146: 1.0, 60296: 1.0, 60357: 3.0, 60380: 2.0, 60456: 1.0, 60459: 1.0, 60497: 1.0, 60687: 1.0, 60733: 1.0, 60888: 1.0, 60922: 1.0, 61009: 1.0, 61051: 1.0, 61106: 1.0, 61107: 1.0, 61114: 1.0, 61269: 2.0, 61323: 1.0, 61412: 1.0, 61440: 1.0, 61552: 1.0, 61699: 1.0, 61783: 1.0, 61915: 2.0, 62009: 1.0, 62050: 1.0, 62276: 1.0, 62331: 2.0, 62373: 1.0, 62395: 1.0, 62492: 1.0, 62632: 1.0, 62662: 1.0, 62907: 1.0, 63018: 1.0, 63065: 1.0, 63159: 1.0, 63236: 1.0, 63244: 1.0, 63444: 1.0, 63551: 1.0, 63657: 1.0, 63685: 1.0, 63815: 1.0, 63876: 1.0, 63938: 1.0, 64040: 1.0, 64073: 1.0, 64161: 1.0, 64487: 1.0, 64566: 1.0, 64885: 1.0, 65184: 1.0, 65208: 2.0, 65412: 1.0, 65453: 6.0, 65456: 1.0, 65582: 1.0, 65880: 1.0, 66007: 1.0, 66222: 3.0, 66317: 1.0, 66473: 1.0, 66513: 1.0, 66542: 1.0, 66650: 1.0, 66668: 1.0, 66843: 1.0, 67126: 
1.0, 67634: 1.0, 67833: 1.0, 67835: 1.0, 67865: 1.0, 67888: 1.0, 68077: 1.0, 68128: 1.0, 68132: 1.0, 68140: 1.0, 68285: 1.0, 68410: 1.0, 68432: 1.0, 68436: 5.0, 68459: 1.0, 68599: 1.0, 68601: 1.0, 68831: 1.0, 68883: 1.0, 68896: 1.0, 68993: 1.0, 69152: 1.0, 69219: 1.0, 69279: 1.0, 69380: 1.0, 69408: 1.0, 69466: 5.0, 69616: 1.0, 69667: 1.0, 69698: 2.0, 69901: 1.0, 69930: 1.0, 70032: 1.0, 70108: 1.0, 70201: 1.0, 70208: 1.0, 70265: 1.0, 70469: 1.0, 70539: 2.0, 70782: 1.0, 70821: 1.0, 70992: 1.0, 71049: 1.0, 71155: 1.0, 71225: 1.0, 71228: 1.0, 71313: 6.0, 71343: 1.0, 71389: 1.0, 71463: 1.0, 71481: 1.0, 71524: 2.0, 71595: 2.0, 71707: 1.0, 71797: 1.0, 71829: 1.0, 71838: 12.0, 71936: 1.0, 72050: 1.0, 72074: 1.0, 72095: 2.0, 72159: 1.0, 72477: 1.0, 72710: 1.0, 72728: 1.0, 72882: 1.0, 72949: 1.0, 72957: 2.0, 72966: 1.0, 73434: 1.0, 73572: 1.0, 73779: 1.0, 73884: 1.0, 73948: 1.0, 73974: 1.0, 74000: 1.0, 74141: 1.0, 74167: 1.0, 74329: 1.0, 74457: 1.0, 74535: 1.0, 74613: 1.0, 74667: 1.0, 74881: 1.0, 74965: 1.0, 75035: 1.0, 75036: 1.0, 75090: 1.0, 75102: 2.0, 75139: 2.0, 75319: 3.0, 75649: 1.0, 75806: 1.0, 75812: 1.0, 75854: 6.0, 76115: 1.0, 76185: 1.0, 76232: 1.0, 76341: 1.0, 76384: 1.0, 76410: 1.0, 76572: 1.0, 76584: 1.0, 76662: 1.0, 76711: 1.0, 76788: 1.0, 76861: 4.0, 76974: 2.0, 77048: 1.0, 77080: 1.0, 77104: 1.0, 77301: 1.0, 77374: 1.0, 77670: 1.0, 77682: 1.0, 77788: 1.0, 77831: 1.0, 77901: 1.0, 77975: 1.0, 78062: 1.0, 78264: 6.0, 78340: 1.0, 78728: 1.0, 78802: 1.0, 78803: 1.0, 79090: 2.0, 79155: 1.0, 79227: 1.0, 79260: 1.0, 79318: 1.0, 79324: 1.0, 79590: 1.0, 79639: 1.0, 79756: 1.0, 79758: 1.0, 79883: 1.0, 79893: 1.0, 80001: 2.0, 80068: 1.0, 80081: 1.0, 80096: 1.0, 80290: 1.0, 80370: 1.0, 80567: 1.0, 80670: 1.0, 80801: 1.0, 80872: 1.0, 80879: 1.0, 80933: 1.0, 80936: 1.0, 80945: 1.0, 81193: 1.0, 81212: 1.0, 81227: 1.0, 81509: 1.0, 81742: 1.0, 81742: 1.0, 81859: 2.0, 81869: 4.0, 81939: 4.0, 81972: 1.0, 81979: 1.0, 82268: 1.0, 82350: 20.0, 82364: 1.0, 82366: 1.0, 82566: 4.0, 
82591: 1.0, 82631: 1.0, 82840: 1.0, 82889: 1.0, 82929: 1.0, 82982: 1.0, 83044: 2.0, 83110: 1.0, 83168: 2.0, 83316: 1.0, 83416: 1.0, 83512: 1.0, 83533: 1.0, 83543: 1.0, 83625: 1.0, 83652: 1.0, 83725: 4.0, 83833: 1.0, 84014: 1.0, 84043: 1.0, 84302: 1.0, 84410: 1.0, 84578: 1.0, 84667: 1.0, 84690: 1.0, 84966: 1.0, 84993: 2.0, 85095: 1.0, 85512: 1.0, 85517: 1.0, 85590: 1.0, 85719: 1.0, 86085: 1.0, 86106: 1.0, 86123: 5.0, 86198: 1.0, 86276: 1.0, 86342: 1.0, 86493: 1.0, 86642: 2.0, 86665: 1.0, 86823: 1.0, 87032: 1.0, 87106: 1.0, 87309: 1.0, 87408: 1.0, 87409: 1.0, 87414: 1.0, 87608: 1.0, 87753: 1.0, 87873: 7.0, 87902: 1.0, 88003: 1.0, 88038: 1.0, 88166: 1.0, 88194: 1.0, 88198: 1.0, 88222: 1.0, 88485: 1.0, 88583: 1.0, 88604: 1.0, 88609: 1.0, 88648: 1.0, 88814: 1.0, 88887: 5.0, 89109: 3.0, 89124: 1.0, 89295: 1.0, 89395: 1.0, 89452: 1.0, 89497: 3.0, 89748: 1.0, 89832: 1.0, 90131: 1.0, 90217: 1.0, 90273: 1.0, 90559: 1.0, 90904: 1.0, 90960: 1.0, 91106: 1.0, 91395: 1.0, 91434: 1.0, 91459: 1.0, 91489: 1.0, 91615: 1.0, 91632: 1.0, 91917: 1.0, 91924: 1.0, 91936: 1.0, 91977: 1.0, 92047: 2.0, 92086: 1.0, 92117: 1.0, 92121: 3.0, 92231: 1.0, 92635: 1.0, 92699: 1.0, 92716: 2.0, 92773: 1.0, 93145: 1.0, 93151: 1.0, 93482: 1.0, 93491: 1.0, 93584: 1.0, 93589: 1.0, 93796: 2.0, 93841: 1.0, 93849: 1.0, 94086: 1.0, 94118: 1.0, 94225: 1.0, 94285: 1.0, 94293: 1.0, 94346: 1.0, 94433: 1.0, 94438: 1.0, 94555: 1.0, 94642: 1.0, 94792: 1.0, 95457: 1.0, 95508: 1.0, 95949: 1.0, 95971: 8.0, 96004: 1.0, 96036: 1.0, 96133: 1.0, 96162: 1.0, 96199: 1.0, 96453: 1.0, 96511: 1.0, 96622: 1.0, 96656: 1.0, 97149: 1.0, 97185: 1.0, 97187: 1.0, 97194: 1.0, 97219: 1.0, 97274: 1.0, 97323: 1.0, 97431: 5.0, 97515: 1.0, 97654: 1.0, 97822: 1.0, 97948: 1.0, 98003: 1.0, 98161: 1.0, 98238: 1.0, 98358: 1.0, 98460: 1.0, 98462: 1.0, 98525: 1.0, 98579: 1.0, 98617: 1.0, 98744: 1.0, 98984: 1.0, 99081: 2.0, 99168: 1.0, 99297: 1.0, 99542: 1.0, 99691: 1.0, 99696: 1.0, 99710: 1.0, 99739: 1.0, 99960: 1.0, 100110: 1.0, 100230: 1.0, 
100293: 2.0, 100319: 1.0, 100407: 2.0, 100419: 1.0, 100424: 3.0, 100447: 1.0, 100630: 1.0, 100808: 1.0, 100812: 2.0, 100972: 14.0, 101010: 2.0, 101243: 1.0, 101268: 1.0, 101439: 1.0, 101496: 1.0, 101501: 1.0, 101911: 1.0, 102032: 1.0, 102207: 1.0, 102237: 1.0, 102240: 1.0, 102256: 1.0, 102274: 1.0, 102282: 1.0, 102352: 2.0, 102355: 1.0, 102387: 1.0, 102398: 3.0, 102406: 1.0, 102530: 1.0, 102559: 2.0, 102610: 1.0, 102618: 6.0, 102982: 1.0, 103054: 1.0, 103074: 2.0, 103152: 1.0, 103267: 1.0, 103271: 1.0, 103393: 2.0, 103463: 2.0, 103466: 1.0, 103696: 1.0, 103846: 1.0, 103967: 1.0, 104187: 2.0, 104383: 1.0, 104401: 1.0, 104404: 1.0, 104429: 1.0, 104539: 2.0, 104984: 1.0, 105059: 3.0, 105105: 1.0, 105180: 1.0, 105207: 1.0, 105362: 1.0, 105383: 1.0, 105410: 2.0, 105486: 2.0, 105516: 1.0, 105542: 1.0, 105550: 2.0, 105640: 1.0, 105889: 3.0, 106007: 1.0, 106025: 1.0, 106208: 1.0, 106236: 1.0, 106283: 1.0, 106699: 1.0, 106816: 1.0, 107214: 10.0, 107258: 1.0, 107270: 1.0, 107306: 3.0, 107362: 1.0, 107497: 1.0, 107837: 1.0, 107891: 1.0, 108569: 1.0, 108668: 1.0, 108804: 1.0, 108915: 3.0, 108964: 1.0, 109020: 2.0, 109316: 1.0, 109443: 1.0, 109780: 1.0, 109990: 1.0, 110269: 1.0, 110414: 1.0, 110642: 1.0, 110672: 2.0, 110738: 1.0, 110944: 1.0, 110984: 1.0, 111035: 1.0, 111220: 1.0, 111228: 1.0, 111249: 1.0, 111262: 1.0, 111287: 1.0, 111315: 1.0, 111330: 1.0, 111648: 1.0, 111673: 1.0, 111765: 1.0, 111797: 1.0, 112024: 1.0, 112130: 1.0, 112291: 1.0, 112350: 1.0, 112495: 1.0, 112566: 1.0, 112622: 5.0, 112858: 1.0, 112994: 1.0, 113069: 2.0, 113130: 2.0, 113214: 1.0, 113505: 1.0, 113539: 1.0, 113648: 1.0, 113650: 1.0, 113677: 1.0, 113720: 1.0, 113933: 1.0, 114091: 1.0, 114205: 1.0, 114364: 1.0, 114439: 2.0, 114483: 1.0, 114628: 1.0, 114757: 1.0, 114773: 1.0, 114788: 2.0, 114795: 1.0, 114811: 1.0, 114860: 1.0, 114870: 1.0, 114993: 2.0, 115167: 1.0, 115189: 1.0, 115436: 1.0, 115457: 4.0, 115472: 1.0, 115556: 1.0, 115721: 1.0, 115798: 1.0, 115807: 1.0, 115819: 1.0, 115955: 1.0, 116068: 
2.0, 116189: 2.0, 116290: 2.0, 116318: 1.0, 116554: 20.0, 116647: 1.0, 116689: 1.0, 116744: 1.0, 116814: 1.0, 116934: 1.0, 116953: 2.0, 117002: 1.0, 117005: 1.0, 117049: 1.0, 117243: 1.0, 117257: 1.0, 117575: 1.0, 117797: 1.0, 117845: 4.0, 117940: 1.0, 117984: 1.0, 118094: 1.0, 118205: 1.0, 118214: 1.0, 118303: 3.0, 118515: 1.0, 118642: 2.0, 118649: 7.0, 118669: 1.0, 118683: 2.0, 118841: 1.0, 118871: 1.0, 118882: 1.0, 119003: 1.0, 119201: 19.0, 119208: 1.0, 119223: 2.0, 119225: 1.0, 119576: 1.0, 119694: 1.0, 119698: 1.0, 119863: 2.0, 119878: 1.0, 119934: 1.0, 120063: 2.0, 120119: 1.0, 120160: 3.0, 120234: 1.0, 120358: 2.0, 120653: 1.0, 120677: 1.0, 120761: 1.0, 120770: 1.0, 120771: 1.0, 120842: 1.0, 120859: 1.0, 121017: 1.0, 121029: 1.0, 121270: 1.0, 121313: 1.0, 121384: 1.0, 121505: 1.0, 121513: 1.0, 121671: 1.0, 121683: 1.0, 121698: 1.0, 121725: 1.0, 121728: 1.0, 121774: 1.0, 122165: 1.0, 122185: 1.0, 122209: 1.0, 122235: 1.0, 122287: 1.0, 122287: 1.0, 122290: 1.0, 122302: 1.0, 122307: 1.0, 122312: 1.0, 122356: 1.0, 122480: 1.0, 122494: 1.0, 122542: 1.0, 122583: 2.0, 122680: 1.0, 122995: 1.0, 123028: 1.0, 123172: 1.0, 123239: 2.0, 123281: 1.0, 123335: 1.0, 123346: 2.0, 123361: 2.0, 123512: 2.0, 123554: 1.0, 123723: 1.0, 123734: 1.0, 123735: 1.0, 123763: 1.0, 123864: 1.0, 123994: 1.0, 124040: 6.0, 124102: 1.0, 124149: 1.0, 124198: 1.0, 124546: 1.0, 124571: 2.0, 124694: 1.0, 124726: 1.0, 124729: 1.0, 125052: 1.0, 125070: 1.0, 125082: 2.0, 125103: 1.0, 125231: 1.0, 125253: 2.0, 125577: 1.0, 125643: 1.0, 125695: 1.0, 125697: 1.0, 125721: 1.0, 125827: 1.0, 125870: 1.0, 125909: 1.0, 125916: 1.0, 125980: 1.0, 126222: 1.0, 126227: 1.0, 126273: 1.0, 126282: 3.0, 126403: 3.0, 126561: 1.0, 126655: 2.0, 126894: 1.0, 126901: 2.0, 126967: 1.0, 126996: 1.0, 127019: 3.0, 127027: 2.0, 127117: 3.0, 127219: 1.0, 127290: 1.0, 127355: 1.0, 127416: 1.0, 127436: 1.0, 127458: 1.0, 127507: 1.0, 127531: 7.0, 127829: 1.0, 127855: 1.0, 127893: 1.0, 127915: 1.0, 127996: 2.0, 128022: 1.0, 
128127: 1.0, 128198: 1.0, 128509: 1.0, 128558: 6.0, 128728: 2.0, 128743: 1.0, 128857: 1.0, 128876: 1.0, 128939: 1.0, 129015: 1.0, 129254: 1.0, 129317: 6.0, 129372: 1.0, 129374: 1.0, 129376: 1.0, 129522: 1.0, 129607: 2.0, 129699: 1.0, 129710: 1.0, 129711: 1.0, 129768: 1.0, 129836: 1.0, 129950: 3.0, 129975: 1.0, 130044: 1.0, 130191: 1.0, 130255: 1.0, 130382: 1.0, 130420: 3.0, 130620: 1.0, 130697: 6.0, 130811: 2.0, 130866: 1.0, 130916: 1.0, 130922: 1.0, 130939: 2.0, 130955: 2.0, 130988: 1.0, 131001: 2.0, 131059: 2.0}))" | |
] | |
} | |
], | |
"prompt_number": 134 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The number of ngrams found in this particular document: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"len(cv.docvec_rdd.first()[1].values)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 135, | |
"text": [ | |
"1353" | |
] | |
} | |
], | |
"prompt_number": 135 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's define a simple histogram of the number of ngrams occurring in each document across the corpus: " | |
] | |
}, | |
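{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def get_bins(x, binedges) :\n", | |
"    import bisect\n", | |
"    # index k such that binedges[k-1] < x <= binedges[k]\n", | |
"    # (0 if x <= binedges[0], len(binedges) if x is above the last edge)\n", | |
"    return bisect.bisect_left(binedges,x)" | |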
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 136 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def histogram_rdd(rdd, nbins, minval, maxval) : \n", | |
"    import numpy as np\n", | |
"\n", | |
"    # minval and maxval are log10 exponents: the bin edges run from 10**minval to 10**maxval\n", | |
"    binedges = np.logspace(minval,maxval,nbins+1)\n", | |
"    bins = 0.5*(binedges[1:]+binedges[:-1])\n", | |
"    \n", | |
"    # First figure out which bin each document falls into: \n", | |
"    binned = rdd.map(lambda (_,x): get_bins(len(x),binedges))\n", | |
"\n", | |
"    # Then add up all the bins: \n", | |
"    res = binned.countByValue()\n", | |
"\n", | |
"    # This is a sparse result -- turn into a dense vector for plotting. \n", | |
"    # get_bins returns k for a value in (binedges[k-1], binedges[k]], so shift by one; \n", | |
"    # underflow is added to the first bin and overflow to the last: \n", | |
"    res_full = np.zeros(nbins)\n", | |
"    for k, count in res.iteritems() : \n", | |
"        res_full[min(max(k-1, 0), nbins-1)] += count\n", | |
"    \n", | |
"    return bins, res_full" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 137 | |
}, | |
{ | |
"cell_type": "heading", | |
"level": 2, | |
"metadata": {}, | |
"source": [ | |
"Filtering on number of judges" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Get the histogram of the full ngram count:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"bins, hist = histogram_rdd(cv.ngram_rdd, 100, 1, 5)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 138 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now we'd like to filter on words used by 5 or more judges. First we can check what the phrases that fit the selection criteria are: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"judge_vocab_rdd = cv.get_judge_vocab_rdd(5,sys.maxint)\n", | |
"judge_vocab = judge_vocab_rdd.collect()\n", | |
"print len(judge_vocab)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"502\n" | |
] | |
} | |
], | |
"prompt_number": 139 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"judge_vocab.sort()\n", | |
"judge_vocab[0:10]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 140, | |
"text": [ | |
"['according',\n", | |
" 'accordingly',\n", | |
" 'account',\n", | |
" 'act',\n", | |
" 'acting',\n", | |
" 'action',\n", | |
" 'acts',\n", | |
" 'actual',\n", | |
" 'actually',\n", | |
" 'addition']" | |
] | |
} | |
], | |
"prompt_number": 140 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Generate an RDD that contains the filtered ngrams for all the cases: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"filtered_ngram = cv.filter_by_rdd(judge_vocab_rdd)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 141 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can plot the distribution of ngram counts for the filtered vs. unfiltered data: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"bins, hist_filtered = histogram_rdd(filtered_ngram, 100, 1, 5)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 142 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"plt.plot(bins, hist, label = 'full')\n", | |
"plt.plot(bins, hist_filtered, label = 'filtered')\n", | |
"plt.xlabel('\\# of ngrams')\n", | |
"plt.semilogx()\n", | |
"plt.legend()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 143, | |
"text": [ | |
"<matplotlib.legend.Legend at 0x113129650>" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"png": "iVBORw0KGgoAAAANSUhEUgAAAlUAAAGOCAYAAAC+F/bDAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztvW2MHNd95vv0TM8Lp0Vy+DIj7bUUkbSsONIGMSltrm8S\nW4zJdW6Q3f0QUjaM7AJ3cyNaQD5cYxG9bHCD8O4Ca70EWAMXyIqksciu70UgSvKH9U02kUhlpBiO\ns5JIGYisxKb4IkuyxOHLUFIPh93T3ffD6X/36ep6OVV1qk5V9/MDBjPdXV19pqam6+n/c87/AQgh\nhBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQgghhBAyOux0PQBCCCGEkKIwYbjdLgBtz9eBrAZF\nCCGEEFI2KobbPQXgVc9937Q8FkIIIYSQkWYewHHXgyCEEEIIKTuPQ9l9V6AqVrvdDocQQgghpHhM\nGmyzGcAKgFsA/FMAXwXwHoBTGY6LEEIIIWSk2QdVsWqDq/8IIYQQQlKxE0pUPeZ6IIQQQgghRaGa\n4DnnAJyAsgX9WEo8GkIIIYSQfNlra0dJRBUAXIOaZ+VlaXZ29r6bb765d8f8/Dzm5+cTvkx2rKys\nZD4uG6+RdB9xnmeybdQ2YY8nfaxIZD1OW/tPsh/b54rJdknOiaz/Bu++C5w5A3z2s8DMTPL98L0l\n3rZ8b+F7S5xtbZwvKysrWFlREub999/HjRs3zgC4w2igGXEGwGd87l+6/fbbO2XgD//wD0vxGkn3\nEed5JttGbRP2eNLHikTW47S1/yT7sX2umGyX5JzI+m/wH/9jpwN0OufOpdsP31vibcv3lj8sxf5H\n9b3lvvvu6wA4b0scRa3+2wPgv0Kt9jvXve9Q9/b/67P9/zY/P7/ja1/7mq3xZcqOHTtK8RpJ9xHn\neSbbRm0T9njQY0tLS9i7d2/kaxeBrM8XW/tPsh/b54rJdnHPl6zPle9+F3jhBeB3fxfYti3dvvje\nEm9bvrfsKMX+R/G95U/+5E9w4cKFawC+YfTiEUR1VN8J4AWomJoTUG0UngfwYsD2S7fffvt958+f\ntzE2MgYcPnwYhw8fdj0MUgKyPle+/nXg938feOMN4K67MnsZkhN8byEm7N27Fy+99NIFADts7C9q\nTtU5xPQZy+Bhk+JQlk+SxD1Znyvr6+p7o5Hpy5Cc4HsLiYHfHPFEmAYqG0NRReLANz5iSl6iqtnM\n9GVITvC9hcSguKKKEELKCEUVISQtSVsqEELISCFiivYfsc3WrVtx9epV18MYa7Zs2YIrV65k/joU\nVYQQAlaqSHZcvXoVnU7H9TDGmkolal2eHWj/EUIIKKoIIemhqCKEEHD1HyEkPRRVhBCCfoWKlSpC\nSFIoqgghBLT/CCHpoagihBDQ/iPEJisrKzh9+nTg7VGFoooQQkD7j5Aojh49imPHjuGOO+7AsWPH\nArc7ceIE7rnnHjzyyCO+t0cZtlQghBDQ/iMkjFOnTuGJJ57AmTNncO+994b23dq/fz/uv/9+nDp1\nyvf2KENRRQghoP1HSBivvvpqL4Zu9+7dkdt7+3KNS58u2n+EEAJWqggJ4rnnnsOzzz6Ls2fP4skn\nn8TJkydx9OhRTExM4Nvf/jYA4IknnsDExARef/11x6N1CytVhBACzqki7vna14AsNMlnPgN84xvJ\nn3/gwAGcPXsWV65cwUMPPQQA2LdvHx588MHeNg8//DAeffTRtEMtPaxUEUIIaP8REsa42HdpYaWK\nEEJA+4+4J001iRQDVqoIIQQUVYRkSV6Bxq6hqCKEEPTFFO0/Qoa5fPkyVlZWBu7btWsXnn/+eQCq\nhxUAvPXWW77PHxf7kKKKEELAShUhQZw+fRonT57EuXPn8M1vfrN3/+OPP47jx4/jU5/6FD75yU9i\nz549ePXVV3H69Gk899xzeO211/Diiy8O3R5lOKeKEE
JAUUVIELt378arr746dP+BAwdw4MCB3m19\nmzNnzgxs6709qrBSRQgh4Oo/Qkh6KKoIIQTsU0UISQ9FFSGEgPYfISQ9FFWEEALaf4SQ9FBUEUII\naP8RQtJDUUUIIaD9RwhJD0UVIYSA9h8hJD0UVYQQAlaqCCHpoagihBBwThUhJD0UVYQQAtp/hJD0\nUFQRQsaedlt9AaxUEZKElZUVnD592slrnz17FufOnXPy2l4oqgghY0+r1f+ZoooQf06cOIEnn3wS\nx44dw7333ovnnnuud/8999yDRx55JPcxHT16FHfccYczQeeFoooQMvboQor2HyHDnD17Fg8++CAe\neughPPDAA3jxxRexa9cuAMD+/ftx//33D2x/6tQpnDx5MvNxHTp0CPPz85m/jikUVYSQsUfmUwGs\nVBHix4kTJ3oiCgA2bdo0cLvT6fR+Pnv2LPbt25fr+IpC1fUACCHENRRVpAh87S++htfff936fj9z\ny2fwjf/1G4mff/LkSRw9ehQrKyt49NFHAQDPPPMMtmzZgldffXVo+1OnTuHatWs4cuQIzp49iwce\neKBXuXrrrbdw9uxZHDlyBDt37sSzzz6LI0eO4Etf+hKOHz+Oc+fO4cyZM4HbA0q0PfHEE/jkJz+J\nS5cuYWVlJfHvZhuKKkLI2COiqlql/UeIl3379mH//v04deoUHnvsMQDAtm3b8PTTT/tuf/DgQQDA\ngw8+iC984Qs9Mfb8888DAL74xS/iq1/9Kp5//nkcPHgQDzzwACYmJnDs2DGcOnUqdHu5ferUKWza\ntAnnzp3Dk08+mfUhMIaiihAy9kh1am6OlSrijjTVpKzR7T2/22EcP34cAHriZ9euXQOr9bZt24b7\n778fO3bswI4dO3D06NHA7Z999lnMz89j06ZNAICdO3cWak4VRRUhZOyRStXcHHD5stuxEDJqvPba\na9izZw8eeuih1Nu/8sor2Lp1q+0hWoOiihAy9oio2rBBVao6HaBScTsmQspEJeQfZsuWLThx4sTA\nfSsrK6hUKti8eXOs7bdv3+47j6socPUfIWTs0e0/YHDiOiFEceXKlcDHvHbgrl278NZbb+HatWv4\n8pe/jFOnTuHBBx/E2bNnceLECTz66KM9QdXpdAaeH7b9gQMHsLKygmPHjgFQYmtlZQVnz57N4DeO\nD0UVIWTs0StVAOdVEaJz7tw5nDx5EqdPn8aLL76Ic+fO4fjx473bp0+fxnPPPYfXXnsNL774IgA1\nWf2RRx7BM888g927d+PIkSM4ceIE7rjjDhw7dgxPPPEEADVH6ty5czh69GivgWfY9rt27cIzzzyD\nxx9/HPfeey+OHTuGPXv24PLly7h27ZqbA6Rhu8C9dN999923tLRkebeEEJIdr78O7N4NfP7zwMsv\nA1evAgWa+0pKTqVSiTWxm9gn6G+wd+9evPTSSy8B2GvjdVipIoSMPfpEdYCVKkJIMiiqCCFjj4go\n2n+EkDRQVBFCxh7vnCo2ACWEJIGiihAy9tD+I4TYgKKKEDL2UFQRQmxAUUUIGXu8c6po/xFCkkBR\nRQgZe1ipIoTYgDE1hJCxh6KKZMmWLVtCY1xI9mzZsiWX16GoIoSMPVz9R7IkLN6FjBa0/wghY483\n+4+VKkJIEiiqCCFjD+0/QogNKKoIIWMP7T9CiA2SiKrHATxleyCEEOIK2n+EEBvEFVV7ADwEgHHb\nhJCRwVupoqgihCQhrqg6BOBUFgMhhBBXeOdU0f4jhCQhjqh6CsAjANhsgxAyUnCiOiHEBqai6iCA\n5wFcy3AshBDiBG9MDUUVISQJJqJqHsB+AN/u3uZ8KkLISMHVf4QQG5iIqscAPJz1QAghxBWcqE4I\nsUGUqDoE4BkAH2r3VcB5VYSk5uEXHsZ/Pv2fXQ+DQImqahWYnla3KaoIIUmIyv47BNVGwcvu7mOH\nAHxTf2BlZQWHDx
/u3d67dy/27t2bapCEjCLPvfkc9vyjPfjt3b/teihjT7OpRNXUlLpN+4+Q0WVp\naQlLS0sAgPPnzwNqmpMVokTVPgB6tHMFwAsAXoNaCXjV+4T5+fkBUUUI8afRauDG+g3XwyDoV6oq\nFfWdlSpCRhe92LO0tIQLFy6s2Np3lKi6huEVf9cAXAFw3tYgCBlHmq0mbrQoqorA+nq/SjU1RVFF\nCElGkpgarv4jxALNdpOVqoIglSpAiSraf4SQJERVqvy41/ooCBlDGq0GK1UFQeZUAWqyOitVhJAk\nJKlUEUIs0GyxUlUUvJUqiipCSBIoqghxRLPNOVVFwTunivYfISQJFFWEOKDVbqHdabNSVRBo/xFC\nbEBRRYgDmm111WalqhjQ/iOE2ICiihAHNFvqqr22vuZ4JATg6j9CiB0oqghxQKOlrtq0/4qBPqeK\n9h8hJCkUVYQ4gPZfsdDnVNH+I4QkhaKKEAeI/ddoNdDpsJ+ua2j/EUJsQFFFiAPE/vP+TNxA+48Q\nYgOKKkIcIPYfQAuwCHD1HyHEBhRVhDhA7D+Ak9WLgHdOFe0/QkgSKKoIcYBu+bFS5R69UkX7jxCS\nFIoqQhwwYP+xUuUcb0wNRRUhJAkUVYQ4QLf/2ADUPVz9RwixAUUVIQ6g/VcsmP1HCLEBRRUhDqD9\nVyy4+o8QYgOKKkIcMLD6j5Uq53jnVNH+I4QkgaKKEAcM2H+sVDmH9h8hxAYUVYQ4gM0/iwXtP0KI\nDSiqCHEAm38WCz9RxUhGQkhcKKoIcQBX/xULb/af3EcIIXGgqCLEAVz9Vyy8MTVyHyGExIGiihAH\nsPlnsfDafwBXABJC4kNRRYgDaP8Vh3ZbfXntP1aqCCFxoagixAG0/4pDq6W+0/4jhKSFoooQB7D5\nZ3EQ8UT7jxCSlqrrARAyjjRaDVRQQXWiykqVY2SVn978E2ClihASH4oqQhzQbDcxPTmN6clpVqoc\nI6JKj6kBKKoIIfGh/UeIA5qtJqYmpzBTnWGlyjHeShXtP0JIUiiqCHFAo9XA1MQUZiZnWKlyjHdO\nFe0/QkhSKKoIcYDYfzPVGfapckxQpYqiihASF4oqQhzQs/9YqXJO0Jwq2n+EkLhQVBHigEa7a/9x\nTpVzaP8RQmxBUUWIA5qtrv3HSpVzaP8RQmxBUUWIA5ptrv4rClz9RwixBUUVIQ7g6r/i4J1TRfuP\nEJIUiipCHCD232x1lpUqxwTF1FBUEULiQlFFiAMG7D9WqpxC+48QYguKKkIcMGD/sVLlFNp/hBBb\nUFQR4gB99R+bf7qFq/8IIbagqCLEAbT/ikPQnCraf4SQuFBUEeIA2n/FwVupov1HCEkKRRUhDujZ\nf6xUOScopoaiihASF4oqQhzQs/8mZ9BoNdDpdFwPaWzh6j9CiC0oqghxQM/+q870bhM3eOdUVSrA\n5CQrVYSQ+FBUEeIAvfknAFqADvFWqgA1r4qiihASF4oqQhzQbDd7E9UBcLK6Q7xzquRn2n+EkLhQ\nVBHigEar0WupALBS5RKv/QcoUcVKFSEkLhRVhDhAb/4JgA1AHUL7jxBiC4oqQhzQs/+qtP9c4yeq\naP8RQpJAUUVIzrTaLbQ77V5LBYD2n0uC5lSxUkUIiQtFFSE502yrq7U0/wRYqXKJ35wq2n+EkCRQ\nVBGSM82WuloPrP5jpcoZtP8IIbagqCIkZ6TR58DqP1aqnCGianKyfx/tP0JIEkxF1cMArgBoA3gV\nwO7MRkTIiKPbf2z+6Z71dVWlqlT699H+I4QkwURUPQbgDIAdAO4BMA/gZIZjImSk8bX/WKlyRrM5\naP0BtP8IIcmIElWbAbwA4NsAPgRwGsDjUMJqU7ZDI2Q08bX/WKlyhlSqdGj/EUKSECWqrmG4KlUB\n8BqUyCKExGRg9R+bfzpnfX2wnQJA+48Qkoy4E9XnAewHsC+DsRAyFgzYf5yo7pyg
ShXtP0JIXKrR\nmwBQNuBXATwKJawA4EuZjIiQEWfA/mNLBecEzalipYoQEhfTStU1AE9AVahOADgI4IGsBkXIKMPm\nn8XCr1JF+48QkgTTSpVwGsAXodor7AdwzPqIyNjxwccf4J//6T/Hc196Drdtvs3qvg8cP4Dvvv3d\nofu3zG7B9/7372Hrhq1WX88E3f6bnpwGwEqVS/zmVLm0/37rt4B/9s+Ar3zFzesTQpITV1QJxwFs\n8XtgZWUFhw8f7t3eu3cv9u7dm/BlyDjwdxf/Dq+89wp+8MEPrIqqTqeDP/vRn+Huxbvxi//TL/bu\nf/ejd/GdH30Hf3/p7/FLt/2StdczRbf/JioTmJqYYqXKIUWy/9pt4OmngU2bKKoIyYqlpSUsLS0B\nAM6fPw/0pzWlJqmo2gXgKb8H5ufnB0QVIVHUm3X1vVG3ut+PGh/hRusGvvKPv4Lf+6Xf691/6qen\n8J0ffQcX6xetvp4puv0HADPVGVaqHFIk++/qVaDVAm7wdCAkM/Riz9LSEi5cuLBia98mfaqOYLCD\n+h4AV6F6VxGSGhFTIq5ssVxfBgAszC0M3C+35fG80e0/AJitzrJS5ZAirf5b7p6SFFWElBOTStU9\nUH2pngXwCoCzAL6c5aDIeLHaXB34bovl1a6oqnlEVfe2PJ43uv0HADOTM+xT5ZCgOVUuKlUUVYSU\nmyhRdQ3AvXkMhIwvWdl/Yu8t1hYH7p+tzmLj9EbafwSA/5wqV/bfxe4pSVFFSDmJ2/yTEOvkbf8B\nqlrlqlLltf9mJimqXBIWU9Pp5DsWVqoIKTcUVcQ5edt/gBJaruZUDdl/1RnOqXJIkP0nj+UJRRUh\n5YaiijgnS/uvNlXD3NTc0GOLtcXi2H+sVDklaPUfkL8FKPbfGqfYEVJKKKqIczKz/1aXfatUQLdS\nVRT7j5UqpwT1qQLyXwHIShUh5YaiijhndT0j+6++7DufCujOqaovo5P3pBn4r/5jpcodQXOqgPwr\nVRRVhJQbiirinKwqVRfrF4dW/gmLtUU0201cu3HN6mua4Lv6j5UqZ/jNqXJt/1FUEVJOKKqIc7Ka\nUxVl/wFuGoD6Nv9kpcoZYZUq2n+EkDhQVBHnZLH6r9PpRNp/gJsGoI1WAxVUMDkxCYDNP10TNqcq\nz0pVuw1cuqR+pqgipJxQVBHnZGH/Se5fmP0HwMkKwGa72bP+gO6cKtp/zijK6j/J/atWKaoIKSsU\nVcQ5Wdh/YY0/9ftd2X8ySR1gR3XXhPWpytP+E+vvE5+gqCKkrFBUEedkYf+FNf7U73dl/8l8KoCV\nKtcUxf4TUXXrrep12+38XpsQYgeKKuKcLOy/oNw/wWX+35D9x0qVU4pi/8nKv9tuU99ZrSKkfFBU\nEeeImFpbX0Or3bKyzyj7D3CX/zdk/03OoNFqOOmZRYqz+k+vVAEUVYSUEYoq4pT19joarQY2Tm8E\nAFxfv25lv1H2H+Au/6/R9th/1Rl1fyvn9fsEQPicKhf23yc+ob5TVBFSPiiqiFNkHpWIH1uT1cNy\n/wRX+X/N1vDqPwC0AB3hN6fKlf03Pw9sVJ8vKKoIKSEUVcQpPVHVtelszasKa/wpuMr/a7YH7b/Z\n6iwAcLK6I4pk/y0sADNKY1NUEVJCKKqIU6QyJRPKba0ADGv8KbjK/xta/de1/9gANH/abfVVFPtv\ncZGiipAyQ1FFnCKVqV6lyqL9F7TyT3CV/0f7rzi0uusiimL/sVJFSLmhqCJOGZpTlbP9B+TfANRr\n/0mlivZf/ohoov1HCLEBRRVxShb2X1Tun+CqAahf80+AlSoXrK+r766bf0ruH+0/QsoNRRVxShb2\nX1Tun+Aq/2/I/mOlyhkiqrxzqvK2/yT3T69UrXGKHSGlg6KKOCUL+8+k8af+uHP7j5UqZ0RVqvKy\n/6RH1cICMKsWg7JSRUgJoagiTsnC/jNp/Kk/
7tz+Y6XKGVFzqvKqVImoov1HSLmhqCJOycL+i8r9\nE1zl/3H1X3EIqlTlbf9J7h8nqhNSbqrRmxCSHVKZ2ja3DUC+9h/gJv8vqPkn+1TlT9CcKpf2X7ut\nfqaoIqR8sFJFnFJv1FFBBRuqGzA3NZer/Qe4yf+j/Vccguy/SgWYnMzf/tu+nZUqQsoMRRVxSr1Z\nR226hkqlgtpUzZr9F5X7J7jI/6P9VxyC7D9AWYB52n/z8+o1KaoIKS8UVcQpq83VnviZm5qzY/8Z\nNP4UXOT/NdtNVqoKQpiomprK1/5b6J6yFFWElBeKKuKUerOO2lQNAFCbrtmx/wwafwou8v8arQZb\nKhSEoDlVcl+e9t9id12FTJKnqCKkfFBUEafUG8r+A6DsPwuVKpPcP8FF/h+bfxaHoDlVQP72n1Sq\nKhX12mz+SUj5oKgiThmy/yzMqYpr/wH5NgD12n8isFipyp8i2n+AagDKShUh5YOiijjFtv1nmvsn\n5N0AtNVuod1pD9h/E5UJTE1MsVLlgCLYf3runzAzQ1FFSBmhqCJOsW3/meb+CXnn/zXb6iqt23+A\nsgBZqcqfIqz+03P/BIoqQsoJRRVxim37L07jT327vOy/ZktdpXX7D1ANQNn8M3/C5lTlZf/pjT8F\niipCyglFFXHKgP03ld7+i9P4U98uL/uv0VJXad3+A9QKQNp/+RM1pyqPSpWe+ydQVBFSTiiqiFPq\njcE5VWntP9PcPyHv/D/af8UibE5VXvafnvsnUFQRUk4oqohTvPbf2voaWu1W4v3Ftf+AfPP/guy/\nmUmKKhcUYfUf7T9CRgeKKuKMVruFG60bAxPVAeD6+vXE+4xr/wH55v8F2n9V2n8uiJpTlaf9t317\n/z6KKkLKCUUVcYZYfbr9ByDVZPU4uX9Cnvl/gfYfK1VOKMLqPz33T6CoIqScUFQRZ8ikdN3+A5Bq\nXlWcxp9Cnvl/gfYfK1VOiOpTlZf9t+A5ZWdn2VGdkDJCUUWcIRUpr/2XZgVgnMafQp75f6Gr/1ip\nyp2i2H+LnnUVrFQRUk4oqogzsrL/TFf+CXnm/4Wu/mOlKneKYv95K1UUVYSUE4oq4owi2X9APg1A\n2fyzWBRl9R9FFSGjAUUVcYZt+y9u7p+QZwNQ2n/FwnX2n1/uH0BRRUhZoagizrBt/8XN/RPyzP8L\nXf1H+y93wuZU5WH/+eX+ARRVhJQViiriDNv2X5LGn/r2Lu0/dlR3g2v7z6/xJ0BRRUhZoagizrBt\n/yVp/Klv79z+Y6Uqd0RUTU4OP5aH/eeX+wcoUdVsKnuQEFIeKKqIM2zbf3Fz/4Q88/+Y/Vcs1tdV\nlapSGX4sD/vPL/cPUKIKyGeiPCHEHhRVxBle+2+2Ogsgf/sPyC//Lyz7r9Fq5NIri/RpNv2tP6Bf\nqcryTxJk/82qfwU2ACWkZFBUEWfUG3VUUOmJqYnKBOam5nK3/4D88v/Csv/0x0k+SKXKD1kRKBZh\nFvjl/gH9ShXnVRFSLiiqiDPqzTpq0zVUNO+lNlVLZf/Fzf0T8sr/C1v9B4AWYM6sr/u3UwD6WXxZ\nWoB+uX8ARRUhZYWiijhjtbk6JIDmpuaS238JGn8KeeX/hTX/BMAGoDljUqnKcl6TX+NPgKKKkLJC\nUUWcUW/We5PUhdp0Lbn9l6Dxp5BX/l+U/ccVgPkSNadKtskKv9w/gKKKkLJCUUWcUW/Ueyv+hNpU\nLXGlKknun5BX/h/tv2IRVqnKy/5jpYqQ0cFUVD0O4AqANoBXAezObERkbPCz/2rTyedUpbX/gOwb\ngIY1/wRYqcqbsDlVtP8IIXExEVVHoMTUQQCPANgD4DUAmzMcFxkD/Oy/pKv/kub+CXk1AG20Gqig\ngsmJwW6TrFS5waX9F5T7B1BUEVJWokTVLgA/BvBvAbwI4EkA93cfO5ThuMgYYNP+S5r7J+SV/9ds\nN4esP4CV
Kle4tP+Ccv+Afp8qiipCykWUqNoM4I889z3X/b7V/nDIOGHT/kvT+FN/Xh72n3eSOsBK\nlStc2n9BjT+BfqWKzT8JKRdRoup0yGOv2BwIGT987b9qMvsvTeNP/Xl52H/e+VQAK1WuMGmpkFWl\nKij3D6D9R0hZSbL67yCAtwB82/JYyJhRb/i3VEhi/yXN/RPyyv8LtP+6lSr2qcqXsDlVWdt/Qbl/\nAEUVIWUliah6FP15VYQkxtf+m6phbX0NrXYr1r7S2n9APvl/jVbD1/6T5p+0//LFZfNPE/uPooqQ\nchHwdhLI4wD+A4DXMxgLGSNa7RZutG4MTVQXkXV9/Tpumr5p6Hk31m9g95Hd+MmHPxm4X5pqJrX/\nAODm2s14+u+exn/7h/82cP9sdRZ/+S//Env+0Z7Q57c7bXzhv3wB/+Z/+Tf4Fz/7L3y3ababmdt/\nx984jj95/U/w57/156n3lRe/9VvA5z4HPPhg9LZ/8AfA9evAH3lneybAZE6VX6Xq7beBX/914C/+\nArjttvDXWFsDfuEXgPfeG7xfxJo39w9wL6peeAE4fBhYWgo+PoSQYeKIqkMA/hYRtt/KygoOHz7c\nu713717s3bs3ydjICCMWn5/9Byhr0E9UvfvRu3jz0pv49Tt+HT+3/ecGHvvUtk8lyv0T/v2v/nv8\n+Y8HhUi9WceR147g1E9PRYqqq9ev4qULL+GXb/vlYFHVCrf/bFSqvveT7+G/n/nvaHfamKiUo7/v\nX/wFUKmYiaq/+iu1cs6WqJoLOGXC7L9Tp4Af/hB4441oUfX++8CPfgT82q8Bd989+NinPz2c+we4\nF1Uvvwx873vABx8At97qZgyEZMXS0hKWlpYAAOfPnweAeVv7NhVVBwF0MCyodgI4p98xPz8/IKoI\n8UMmo/vZfwAC51WJzfe7/+R38Rt3/obVMe3btQ/7du0buO968zqOvHbEaFWgWIdhE+2D7D+blSp5\n/bX1tVQiM09WV9WX6bbLllxakz5VfvafvL7JmGWbf/2vgS9/2WxcrkWV/H7LyxRVZPTQiz1LS0u4\ncOHCiq19m3yM3Q/gq1Ad1Q9qX0917yMkNtI2Icj+CxImaSekx2XD1AbcNH2T0QR22SZson2g/Wex\nUiWvn7Qzfd60WsoiqxsOt15XTTNb8abd+ZJ09Z9MMjcZs2xTq4VvpyPVK1eiSn6/i9mu2yBk5Iiq\nVO0B8DxUlWqf57EXAGQblEZGFhP7z4+0rROSsDBnNoFdqlmhoirI/rNYqZJjV2/WsYD8jlNSpJIT\nR1R1OsBj1LnTAAAgAElEQVSVK/6TvOMQNqcqzP6TSk5WoqpSUdWqIlSqCCHmRImqU2DoMsmAtPZf\nmlV+cTFdFZjG/hOhZaNSJa+fpN+XC0RUxbH/gODcvDgkXf2XxP4LmrsVxMyMu+afFFWEJIOCiTgh\njf03NzU39LwsWawtxrP/Qmy3IPtvojKBqYkpO5Wqktl/UsmJU6kC7FhTSbP/srb/ALeVKtp/hCSD\nooo4IY39l2eVCujafyYT1VPYf4CyAG00/9TtvzIQR1Str/crRzaqKEmz/7K2/wB3oqrZVKsrAVaq\nCIkLRRVxQmL7b3U51/lUQH9OVafTCd0ujf0HqH5YtP/MtgXsiaok2X952X8uRNXly/2fKaoIiQdF\nFXFCGvsvr5V/wmJtEY1WAx/e+DB0uzT2H6BWANL+M9sWcGf/dTqjXanSjyvtP0LiQVFFnJDY/qs7\nsP8Mw5bl8TT2n5WWCiW1/xoNVTky2RZwZ/9du9a/L46oKkulSo7rJz7BShUhcaGoIk4Isv82VDcA\n8BcEnU7H2ZwqAJHzquTxpPbfzKQdUVVW+8/7c9S2ruw//XVN7b/ZWWAi5ruta1F1992sVBESF4oq\n4oR6o44KKr0gYaFSqWBuas63UvVx42Osra85sf8AhK4AbHfauLR6CUAK+6
+a3v5rtppotpuR4ygS\nerUnqvJj2/4Lq1RVKsDk5HClSn9d00pVXOsPcG//3X038NFHDHUmJA4UVcQJ9WYdtekaKpXK0GO1\nqZpvlcVF40/99cLsv6vXr6LVaWH73HY02000Wz5LxhBh/1moVOkVvrLZf96fw7bdvt1OpSpsThWg\nLECvqJLX3b49W1E1O+uuUjUxAdx5Z/82IcQMiirihNXmamAuXW265isIXDT+1F8vzP4TwbVzfieA\nYOut0WpkWqnSX3eU7b+dO7OfUwUoCzDI/tu509z+izufCnDX/HN5WQnGW25Rt2kBEmIORRVxQr1Z\nH5qkLsxNzfmKqrxz/wST/D957Pb52wEEV4ma7Wamc6p0y2+U7b/bb0+f/9duq6+gOVWAeizI/rv9\n9tG1/xYW+t3qWakixByKKuKEeqMe2BW9aPYfEJ3/J1WsHZt3AAgWNFk3/xwX+2/Hjn7+X1JEkCWx\n/zZtArZsGU1RJfE/FFWExIeiijgh0v7zESWu7D8gOv+vZ/9tSW7/zVZnaf/FsP+AdBd8EUtJ7L+F\nBSWUsrb/XImqxUX1BdD+IyQOFFXECUntv7xz/4So/D957Gc2/wwA/ypRq91CB5387L8RrlTdrlzW\nVBd86YkVJar87D8RVfW6qpiFUbZKlfx+mzer35+VKkLMoagiTkhq/7moUgHR+X/L9WXMz85jfnYe\ngL/9J60OQlf/paxUiZDaOL2xVHOqNm7s/xy1baUC3Habup3mgi+iKmxOVZD9t7ioqk/tdrTwKZOo\nkty/hQV1nG2tsiRkXKCoIk5IZP85yP0TovL/RPBJ9c1PFDZaykcKXf2XslIlr7tYWyyV/SdWk4n9\nV6v1t7chqtLYfzKmMMpk/0nunxzfxUXaf4TEgaKKOCHU/qsG2395r/wTovL/LtYvYqG20I/Z8Rm/\n9K4Ktf/SVqq6YnShtlAq+08mRZtUqmo1YNs2dTvNBd90TpVeqZLcP11UmY45LjMz6rXb7fjPTYoc\nT/l7LCywUkVIHCiqiBPqjWBRVZsOsP8c5P4JUQ1Al1eXsVhb7FXfEtl/FipVIqQW5hZKZf9t3ars\nJhOBMjenxM7WrdlXqrz2n+T+if0nYwqi3e5X1+Iy2w0b8FbKskSOJ0UVIcmgqCJOWG2uhs6pWltf\nQ6vdb0IkuX8uK1VAcANQEXyp7L/JGTRajUCL0YSy2n+1mtlqOl2gpL3gm8yp8tp/uugwsf+keWdS\n+w/I1wKUShXtP0KSQVFFcqfVbuFG60bgnCq5XxcFkvvncqI64J//J7l/C3Mp7b+quoqmqVbVG3VU\nJ6qYn50vlf0noiqOlbawkL/9p9tjJvafPJbU/gPy7aruV6li/h8h5lBUkdyRi32Y/QcMiiqXjT/1\n1/Wz/yT3b7G2iKmJKUxWJhOv/gOQal5VvVnH3NQc5qbmsNpcRbuT44SchIilNzcXLar0Sd+Li/nb\nf/J6pvafVLHSiKo8BY3k/m3dqm6zASgh8aCoIrkjYinM/gMGqz1iu7my/8Ly/3TBV6lUAueERdl/\ns1U1iSZNpWq1uYraVK13DNN2aM+DOPaft1JVdPtPBFeZ7L/t25WwAtgAlJC4UFSR3JEqTpT9p1d7\nxHZzZf+F5f95x1ab8g+ENrb/UlaqatO1vg1Z8MnqrZayt5Laf2ny/5I0/3Rh/+VdqVrQ/sVYqSIk\nHhRVJHfKaP8Bwfl/3ipaUEd4Y/sv5Zwqsf+A4ndVlypPUvsvTf6fyZwqP/tv0yYleEbV/qOoIiQ5\nFFUkd9LYf64qVYASTb6iyiP4ktp/NipVXvuv6CsAddGRxP4Dkl/wkzT/1EXHqNp/i5rDTvuPkHhQ\nVJHcSWr/ucr9ExZqC6H23/a57QC69p/fRHWD5p9AykpVyew/3R5LYv8ByS/4pnOqvPafvK5Jpars\n9h/z/wiJB0UVyZ2k9p/LKhUQnP8nuX
9i6yW2/2zMqRph+299XVWNRMzkUanys//kdaem1FdW9p80\n/8xLVOm5fwLz/wiJB0UVyZ1E9p/Dxp+C2H/e5pxewZfY/rNQqSqb/eetVIVZaV6Bkjb/z7RPldf+\n0+2xqDGXyf67dEl9X/T8m7EBKCHmUFSR3Elq/7mcpA6oSpVf/p83kzCx/detVKVpgyCZiqNo/3mt\ntLT5f3FX/+m5f0LcMcch7+af3safAqNqCDGHoorkTiL7z2HunxDUAHR5dXlA8KVe/WfB/vOr9hUR\n3f6r1VRVSMRO2LaAEjxbtqS3/8LmVOn2n+T+6aIjyrL0jjkOeVeqKKoISQ9FFcmdKPtvQ3UDgL4g\ncJ37JwTl/3kFX23KcfPP6Zpv1E8R0Ss5IjyC7DS/qk+arupxKlVSpZLXFEzsv9nZfjPNOOQtqry5\nfwLtP0LMoagiuVNv1FFBpVeZ8VKpVFS1p2tduc79E/zy/yT3b8D+m05n/yWtVDVbTTTbzVLbf/p9\nYdsKafL/TOdUAUqA6Y0/BRP7L4n1BxSrUsX8P0LMoKgiuSPL/iuVSuA2erWnCI0/9dfX7T/J/dMF\n39zUHJrtZk9ECVk3/5TKXllX/0WJKj8rLY01Zbr6D1ACzE90mNh/ZRJVeu6fwAaghJhDUUVyR1ao\nhVGb7ke9uM79E/zy//wEX9DKu6ybf+q26kRlArPVWdp/IZj2qQLUXK+k9l+S+VSAG/tPz/0T2ACU\nEHMoqkju1Jv1wJV/gj7Z23Xun+CX/yc/e+0/YLhKlHXzT7H6RNQFrUIsEiKUNmxIbv8lzf+LY/81\nm/nbf1Ily7NS5bX+AFaqCIkDRRXJnXqjHtkZvYj2HzCc/+cXn+PXEgLIvvmnbv8Bg9W+oiJZfpVK\ncvsvaf5fEvtv48Z+BUnGkpX9V6mo16KoIqQ8UFSR3DG2/xqD9p/rShUwnP9n0/4TsZW0UuVdVTk3\nNVcK+09ER1L7D0h2wU9i/3lXxmVp/wH5iipv7p9A+48QcyiqSO4ksf9c5/4J3vw/b+4fEG7/VVDB\n5MSk774nKhOYmphK3PzT1/4reKVKF1VJ7T8gnaiKY/95Kzli/3ma7A+MOWmlClCiKs/mn36VKub/\nEWIORRXJnST2XxGqVMBw/p839w8It/+CrD9hpjpj1/4r+Jwqsf8AM/tPLDEhTaiyyZwqr/3nFR1z\nc0C7HVxNSmP/AflVqvxy/wTm/xFiDkUVyZ3Y9l8BGn8K3vw/P8EXZv8FTVIXZquztP9C7L9aTV3k\nBRv2n0mlKsz+ixpzGey/oNw/gQ1ACTGDoorkjpH9Vx20/4owSR0Yzv/z5v4B4fZf0HwqYWYyRaVq\nDOw/b9VH8v/SiKpJfzcWwLCo8rP/ZGx+2LD/8hBVQY0/BUbVEGIGRRXJnXqjblSp6tl/Bcj9E7wN\nQL25f4AF+89C80+gfPbfzIzqkRRm/3mrPpL/l6SKsr6uqlQhPWh79t+lS8O5f0B/PH5j7nTKY/9R\nVBFiB4oqkjuSTxdGbaqGtfU1tNqtwtl/QH9Fop/gS2P/zUwmF1VD9l+1XPZfpaJESpT95yVpA9Bm\nM9z6A/qVqnff7b+WTpj9d/26+l4G+y8o90+g/UeIGRRVJFda7RZutG4Yrf4DlL1WhNw/Qc//88v9\nA0Lsv7aB/ZdmonqjjupEtVcNK0OfKq9QCmumGSSqklZRpFIVhoiq997rv5ZOmP3nt1oxLrOzxalU\nffRRfisRCSkrFFUkV+Qib2L/AcD5lfMAitH4Exi0//xy/wDVh2qyMjls/7UM7L8UlSrvXDVZQdnu\ntBPtLw+8ll6YqPKz/4Dkocrr6+E9qoC+/SeVqjiiSqpXZbH//HL/BDYAJcQMiiqSK16LKggRXSKq\nimL/6fl/QZ3eK5XKwJwwwcj+S1Gp8q6qFIGVtO9VHnirT3naf3EqVUH2X9iKRRFaZbH//HL/hDSr\nLA
kZJyiqSK5I9cbU/utVqgpi/+n5f365f4Lfyjsj+29yJnnzz+Zg/6+eDVnQyeqtlrKTbNh/ly/H\nz/+LM6fKlf2XV/PPoMafAitVhJhBUUVypez2H9DP/wuLz9E7wgtG9l+a1X+NYfsPGJ7bVRT8JnIn\ntf/a7fj5fyaVKt3+8+b+yXiB0bD/TEQVJ6sTEg5FFcmV2PbftfMAilOpAvoNQMOCnpPaf7PV2VQd\n1f3sv6KuAPSr5CS1/4D4VRSTOVXyuF/jT2C07L+glX8A7T9CTKGoIrmSxP4rSu6fIPl/frl/Qm1q\nuEeUqf2XpqVCmew/P1GV1P4D4l/w49h/+uvo5GH/FaFSxfw/QsygqCK5Etf+u7ByoVBVKqCf/+eX\n+ycktv9SdlQvk/0n1R0T+299XXU1D7L/gPjWVBz7T38dnWpVbVNm+y8s90+Q/D/af4SEQ1FFciWu\n/XejdaMwK/8Esf8urg5H1AipVv+laKkwqvZfmEDJw/7TX8dL0JjLYv9F5f4JSVdZEjJOUFSRXIlr\n/wHFmqQO9PP/3rryVmAVLZX9Z6mlwijZf2FWWtL8vzgtFYDgSk6SMZsyO6sqSe0MW41FNf4UGFVD\nSDQUVSRX4tp/QLEmqQN9kffmpTcDBR9X/0UTZP81Gv2w47BthaT5fyZzqqLsPyBYVIWN2RRZbdho\nJN9HFHFEFe0/QsKhqCK5Ymr/bahu6P1cRPsPUL/L4lyA/TeV0P6bnEGj1UjUBd3bp6qs9h8wbKdF\nVX2SWFNxK1VJ7L/Z2eCGmiaIqMrSAozK/RNo/xESjem/+y4ARwDsy3AsZAyoN+qooIKZyZnQ7SqV\nSk8UFK5SpY0ntFKVxP6rquPSaMUrTTRbTay310fC/tMf824bVPVJYk2ZzKmqVIDJyf5r+BFm/6Wx\n/oC+qMqyAWicShXz/wgJx0RU7QPwOIAHAGzOdjhk1JFqSqVSidxWBELh5lRp4wmcUzVdQ7PdRLPV\n7N1nuvoPQOx5VWLx6faf/Fw2+w8YFilRK+mSWFMmlSqgbwEmsf9siaosK1VRuX8Cu6oTEo2JqDoJ\n4D9kPRAyHngnU4chlZai2X+6kApc/df9HXXrrdFqRFaqZquzABB7XpVUo3T7b6IygdnqLO2/AEzm\nVAH9alYS+y/NfCogP/svLPdPYANQQqIxtf+iywqEGFBv1iNX/glFtf8k/w8It/+AwSpRs900aqkA\nxK9U9eaqeQSr3yrEoiBCaUN/+lwq+y9u/p9ppUpElUv7L+tKVZT1B7BSRYgJnKhOcqXeqBt3Ry+q\n/Qf0hV6Y/QcMzmeKZf/FrVT52H8yjiLbf3Nzat6SkMb+i5v/ZzKnClD2n1/unzAK9l8cUcUVgIQE\nQ1FFciWJ/Ve0ShXQt/1s239JK1V+9h+gRFaR7T+v6JBKVFClKsz+A+JVUeJUqsJWxs3NBVeqymL/\nRa38A2j/EWJCpqJqvb0evREZK+Laf0XL/ROkeuaX+wcM23+tdgsddIxaKgDxK1Wh9l9BK1V+okpu\nB82pCrP/gHgX/DhzqsIqObWaGm+nM3i/DftvVk2xK0Slivl/hESTmaj6tf/n17Dvv5p1YDjy6hF8\n8VtfTPV6/+pfAf/u3yV//mpzFXf/8d34q3N/ZbT9F7/1RRx97WjyFywwH974EHf+33fi++9832j7\nz30O+Na3zPYdx/7bNLMJN9duNttxzlw6dwuwuhUzU1OYmEDva2oK+NM/Hbb/mm21CjDK/hMxds/R\nezDxf00MfH3lua8EPi/U/guZU/U3fwPceSdw7VrELxyTRgP4zGeAP/uz4G3E/tMJs/8qlb7I8JLE\nmjKtVG3YANwcchrWasp69AqfPOy/3/kdDJx/Sb6uXg3//YSi5P/99V8Dn/408PHHyffx9a8DXwn+\ndyIkMQZvKfFYWVnB4cOH8ZM3foLl+jKWdixh7969oc/57k++i5Pn
TqLdaWOikkznnTwJ/PSniZ4K\nAHj72tv44fIP8f13vo9f3fmrodu2O22cPHcSt9x0Cw7dcyj5ixaUM1fO4MdXfoy/fedv8dlbPxu6\n7eoq8N3vAnffrYRtFJevX8bP3/zzRuP4g8//AS7WizmBY+ubj2DTG/fj//g/B+9/4gnglVeAT+8d\ntP+k71SU/fcrP/MreHz/4/i4MXjF+M6PvoO/+cnfBD4vzP67vHo58Hl/+7fAj38MvPUWsGdP6NBi\n8f77wA9+oETbb/yG/zZx7b9abXD+lU5S+89kTtUf/7Hq2B6EPmZd9OVh/33ve8BddwG/+ZvJX2Ny\nEvjt3zbbtggNQL//feAf/gE4dw74ebO3kiFeegk4dcruuEh5WFpawtLSEgDg/PnzADBva9/WRdX8\n/DwOHz6MD/6/D/Dsm89GCioAWK4vo91p48r1K4F2ShidjvpHT/PPvlxXT15ejd7JletX0O60jbYt\nI3GOhRxz02O/XF82niN118JduGvhLrMd58zae3fgH2+4Y6g6+q1vqWPhtf+kX1WU/bdhagMe/uWH\nh+6vN+p46rWnAp8XZv+93Xw78Hlx/36mmOw3rv0XJlCS5P+Z2n+f+1z44/qYZRxAPqv/lpeBgwfT\nVenjUIT8Pxvn7PJyf7WoNHcl48PevXt72mRpaQkXLlxYsbXvuGUh49YKC7UFXF69jFY7eo2zVCOS\nViVWVtSnzjRl6ThjSDveohPrWFwc/B5GvVHH9fXrhes7lYSgyb2Li+qxpPZfEIu1Raw2VwOtvKT2\nX5y/XxxM9utn/83MKEvKz/4LEyhJ8v9M7b8o/CzLTseu/efXxbzVUsLAZJK5LYqQ/2fjnL14Mf5q\nUUJMMO2o/iiADoAvwTCqZrG2iA46uHw92HoQpCIiFZK4yCeWS5eGJ4sa72M1RnWmnm68RSfWsYjx\nqVFEWhFX88UlaHKvfJL3hhmb2n9ByMT4oL9JoP1XDV/9V7RKVaXiv5rOpOoT15oytf+i8LMsr18f\nfCwpYZWqy5fV+53JJHNbFMH+S3vOirORZh+EBGHaUf1LACYBfLl7OxK5cEYJj06nE8tu8kP+MdbX\nVdUq0T5iCCVddHSSqrgCE+tYxHhzkuNWxL5TcWi3lYAPE1XeMGNT+y+IqP+n1eYqqhPVoUpYVJ+q\nookqoL+azrttlECJa03ZrlTpY45qAWFKmKgyzeyzSRHy/9Kesx9/3D+eFFXENpmt/pMLZ5SF9FHj\no97y8aR2ml4GTloSTmL/ra2vDU0oHgWS2H9Xrqg5KmGIICi7/Xf1qrJewuy/qYlpTFYmrdp/QPDf\nJKhVRW2qhtXmKtqdtu/zimb/Af7NNE2stLjWlOmcqij87L+oZqWmhIkq+V3ztv8At2Ik7Tlr43pB\nSBCZiSq5CERVn/RP3mntP+/PsfYRo/o0MOYRnKyexP4DlB0RxqjYf2EVgoUF1U7g448rA1WiPOw/\nv6aqIrTW1v1LC1lXqlZWgsV2UKUqT/vPhqjys/+i+mqZEtanykWlqggNQNOeszauF4QEkV2lytD+\n0y8Sae0/78+x9tF97UargY8aHxltC4zmvCr5/VbWVnpiIHDbGMd+VOy/KFEl2+jdzDO3/9ZX/StV\nPnE5QqPR70+VlagClFXqpdVSFlJQpSqp/Wea/9duqy8bc6qytP9kfEURVa4rVfV6/zhTVJEikpmo\n2jan1hZHWUj640Ww/0zGYWPMRUb/nS6t+lwR9W1jHPvl+jI2VDcYx9QUlTDbRe67eHGwm3la+++m\n6ZswW50Ntv8Cmqp6J8zr6BeUrOy/oH3LRO6gOVVJ7T/TFV0ivIpu/1UqygIMsv8qlcE2DlnjOv/P\nxjlL+49kSWaiqjpRxbYN24ztv9s23ZaqUnXrrf2fE+2jvozbNt02MKbAbVe1bUfR/otzLJaB227r\n/xzGxdWLWKgtoBLUwbEkmFaq
9HYGae2/SqWChbmFYPuvGW7/+a0AlN/jttuyqVSFnRdhlZw09l/Q\n63kRS7Lo9h8QLKqWl4GtW+38Dqa4tv9snLNZnveEZJr9t1ALvggI8vhdC3elmlN1660qmyrJP0m7\n08al1Uu4e/HugTEFvl59udeUctTsv7X1NXzU+Mj8WCyrburyc+i2MRp/Fhn5Pbf79KnNyv4Dwv+f\nVpvx7T/5Pe6+2/6KrqjzIkx0pLH/gl7Py3o3lrToq/+AcFGVp/UHuM//08/ZK1f6f8e4+5ibA3bs\noKgi9slWVM0tGFlptakabt98eyr7b2EheWO6q9evotVp4e6Fu3tjCn29+kXsmN+Buam5kbP/RCQa\nH4uLwM/+rGrYGGn/rS6XfuUfoH7P+Xlg2sfJ0+0Rm/YfoBZ/2LT/5O9lKopNuX5dLVuX/fqdF2H2\nmNf+W19X879M7L+g1/MiF2Mbc6qqVXUuZGH/AUpU+QneoAa0WeI6/897zvrN1zPZR5rrBSFhZF+p\nMrDSFmoLqgP79cuBS79D97Hc/ydJcmGQT/8/t/3n1O2QMbc7bVy+fhkLcwuhdkxZ0SuHQPixWF1V\nXzffrOZ1RNp/9Yuln6QOhFcI5ubUhdS2/QeoDylBf4809t9ddw3eTovs5847ldhOa/+ZCpQ41pTN\nShUwPOa87L+8K1WA2wagNs7ZtNcLQsLIVFQtzi0aWWmLtUUs1hZ7+X9x6HTUp5XFxeT/7HKhun3+\ndtw0fVNodUZy/2TMIyequsfizm13YrIyGfr7ybE2PfajZP+FXcz0BqBW7b8QEZ/U/qtWgU99qn/b\nBrKfW25RVY209p+pQImT/2dzThUQPOYs7T+puOSNSzGyvKzaTOzc2b+dZB/ynmW6WpQQUzKvVEXl\n/12sX+xVfYD4c5SuXVNvkGnKuXr/pKjqU2/bbnVt1Ow/+X1urt2M7XPbQ38/fcJ21LGX3L9REFVR\ntksv/y8D+y8o/y+oT1WU/bd9u6o0ym0byH7Czoso+6/R6FeTTCtVcfL/bFeqvJaljDmrSlWrpeYU\n5W3/AW5tMxGSaSbM65Uq5v8R22Q+pyoq/0+3/4D4LQq8b+BJ8v/0/klRk+tF9PUE2IhNVI9zLLzH\nPuwNTv6uozCnyrRSVZuybP+FNACtN/3nVEXZf/K3k9s28IrtJPafvk2cqo9ptdrmnCrA3/6bnVX2\nZ1pmZ4dFlYvcP8G1/aefs3HFXaczOKdK9kmILbK1/6SreoDwkNy/xblF4w7sXrwWVJL8Pxnf9rnt\noZOB9fHp9t8o5f8t15cxNTGFzTOb1e8XIhrj2H+j0vgzLPdPyNL+A4b/n5qtJtbb64nsv8VFNem+\nWrUvqsLOiyj7D+hXe+LMTzK1pvKw/2xYf4B/pUr/QJM3LvP/5Jzdtk1Nmo97zkrun5ybsk9CbJG5\n/QcEV58k92+hltz+834qBuJ/erlYv4j52XlMT05HVp8G7L+5hZHL/5PJ5NIXKY79F5b/p1f4ykxY\n7p8g9t/cVA3NdhPNVtOa/QcM/z+JtRc2UT3I/ltYUBcnm5bOxYtqNdzGjcntP6AvpuKspDP9PfKw\n/7IUVbpwzRuXFR45ZycnzRbHeLFxvSAkjMztPyC4+qRfaLfPqaY/ae0/IME/2mp/ArXMqQqqPsmY\nt23YFpnHVkb8jkUQ3osnEJz/Nyr2n0k0iOT/VTv9+UxZ2n9ShfKz/yYqE5itzobafzJmm5UqXaz5\n5f+Ng/1nYz4VEC6qXNl/+hjyxHvOxv4AbeF6QUgYTu0/3UqbmpzCltktie2/NJMX9f5Ji7VFNFoN\nfHjjw8Btt27YiqnJqcjfr4x4j0VY/p+U4iuV6GM/KvafqagCgNb1/nymLO0/EUx+9h8wOLdLkNw/\n+bvZnCcj54XsFxjuJyQCRAKDB8brsf/iTPo2zf/LolKl239ZV6pc239A/mJEcv/SnLN6hU+a91
JU\nEZtkKqqi8v90K02+JxFVGzeqN+c09p8+BiC4+iSrFYH+RW6UVgD6HYug/D/vp0Yg+Ngv15cxW50d\n6dw/QR5rrvbnM9mw/4Ly/8LsP6DbL8tj/3nFoW37L+q8WF1VIslvIrfX/otTqTJd0ZXFnCpvpcqm\nqPLOX1pezj/3T3Blm/mds2nsv2pVxfzQ/iM2yVRUVSeq2Lphq5H9J9+T2H/yT5b0k4fePylqbpes\nVgSiBVgZiXMs/C6eQcf+4upFLNYWRzr3T5DHGnW79l9Q/l+Y/QcMTpgXvL+H7UpV1HkRJjrS2n9+\nr+clj+afWdt/eef+Ca7sPxsfBLwVPjYAJbbJVFQBCG2Q6bWEolab+e5DsxpmZuLn/0nun255AcHV\nJ2lWCkSLjrIhuX/eYxH49/OxeQLtvxFq/An45/4J8oa99tGg/VdBBZMTk6le36+aa2T/BVSq5O9m\nc0WXyXkRJjrS2n9+r+fF9pwqsf9kKmYe9p8L6w9wl//nPWcXF+Pn/0nun/xtXLaHIKNJ5qIqrPok\nubPoa2UAAB51SURBVH9yMUgS++LtGRT304vk/vWqMzHsv9p0baTy/4YqhxGrN/Vjv3VreP6fXuEr\nM2G5f4Ick9WVQfsvjfUn+LX8MLL/PHOq/D6xA+kvMJL7Z2L/BYkOP/uvUvGff+VFLrhR7wFZzKlq\nt/viJ+uWCrpwzRvbq0VNCTpngxbH+JH2ekFIFNmLqpD8P++FdqG2gEurl2Ll/3k/scUt53qrZWHV\nJz33r/d6I5T/F+dYrK6qC4cc+4mJ8CXOeoWvzJjkrckn4frVQfsvzSR1wa/lhy37T78/Kd79ithO\na//VaupiHoWpOLQ9p8pvzLbsP7/mn65y/wQXtpmf/QfEE0VprxeERJG9/ReS/+e90MbN/9Nz/3r7\niFnOlQuUjGPD1IbA/D89908f88iIKs+x2LJhS2D+n1+fnLBjr1f4yozpxWxhAfjwyqD9l2Y+VW+/\nPiI+qf1Xraqqm4xX7k+D97yYmPDP/4tr/5kKFNP8vywqVcDgmG1WqppNVQkTXNp/gDtRNTsL3HST\nup3kg4C3wsf8P2KbXCpVQfl/3gtt3DlKeu5fbx8xy7l67p8+Dj8h4V2tKD+Piv3nPRYTlYnA/D+/\nCdtBx36ccv+ExUXg2nI29p83/y+p/bd9e3/1na0VXX5L/f3OizDRMTOjxuWtVJkwNWW2oiuLOVWA\nGmunY19UAaoNBuA290+QBrd5ojerBZJ9EPCz/5j/R2ySy5yqoPw/P/sPMG9REPQGHif/z69/UlBr\nB7+u4KOU/xfnWAQde783OL0fWdmJU6m6ejED+89nzl+k/Vf1t//03yMr+09+jmP/VSqDq+nizk8y\nqaJksfoPUGO9fl29/9hc/Qf0LUCXuX+Cq0qV97wCzMWdnvvn3QctQGKLXFb/AcPVJz33b2hbQzst\nyIKKk/+n5/7p4/CtzviIg1HK/9Nz/4SgFZlx7D+/Cl8ZMcn9ExYWgCvva/Zf2579Bwz+P602VzFZ\nmQzcf1CfKv1vZ2tFl+l5ETXnSG+mGcf+A8wu+Fn0qQLUWOPE6pjgFVUuG38KLvL/vOds3Pw/PfdP\nYP4fsU0u9h8wXH3Sc/9628a0/4I+FQPmn1703D99HH5j8LX/Rij/T8/9E4JWbwYde7/8v3HK/RMW\nF4GL709jsjKp7L+WPfsPGPx/qjfrqE3XAnuA1aZqWG2uDiwA8X5ir1SUHWjD/pPoIiGu/QcMNtOM\nW6kysaaymlNVr8frq2WCiCoRMC5z/wQXYsR7zsbN/7NxvSAkilzsP2C4+uR3oY2b/xdkQQEx/tFW\nh/snBeX/6bl/vW1HqAFo2LHwEnTxBIaXOI+K/Rcnb21hAWg2Kr1J4lnbf2Gd6mUC+9p6v6zgZ2Pa\n6Nmj5/71xrwwnP8XJZTysv9sZv8Bg6IqK/vPZe6f4MI28z
tn48yhtXG9ICQKZ/af34U2bv6f35tL\n3E9QetadPma//D8990/fFhiNBqBBx8Iv/0/P/ettG3DsR8X+iyuqAGBmYi57+299NXDlH9CfayVz\nr7y5f/qYbYgq7369+X+tlqq6ZGn/Ra3oov2XjrzFiDf3T4jzQcCvwsf8P2KbzEVVUP5f0IU2Tv6f\nnvvXe34C+89vDMBw9cmvLcAo5f+FHQtv/l/Qp0Zg+NiPU+6fINtMV1Slypb955f/V2/UAyepA/1V\ngTKvKkgc2miE6LfU33teXL/eHVeG9l/Uiq4y2n96pcpV7p9g2mTVFmHnbBr7j/l/xDaZi6qg/L+g\neTZx8v/83sDjfvLwi08Jmtvl1xV8pOy/GMci7OLpPfZSARuH3D9BtpnqqHYGtuw/v/y/etPM/pMV\ngEG/h037T8d7XpiIjrT2n/56fpTJ/pMPjbqocpX7J+RdqbLxQSCowscGoMQmmYsqwL9Bpt/y/d62\nMSaqe6sGcfL/vLl/+hiA4eqTX1fwUcn/8+b+CUErMsNsHj/7r+yT1IH+7xWW+yf0Os237dp/wHA1\nd7UZz/4LmuhsY0WXyXlhIjrS2n/66/mRZfPPPOw/l9YfkH/+X9A5Gyf/z5v7p++DoorYIhdR5Vd9\n8ub+6dvGsf/83lxMP714c/96z49h/41K/l9g5TBg9abfsQ/K/xun3D9Bjk2ladf+A4ZbfsS1/8I+\nsQPJLzDe3D/vfuV1TUSH2H+tlhITce0//fX8sD2nqlpV50Ve9p/LlX9A/vl/UeesSf5f2usFISbk\nI6p88v+CLrRx8v+CPrGZlnODqmV+86T8cv/07ctu/0Udi4GJ0Z7cPyEo/2+ccv8E+UTcadi1/4Dh\nlh827T/98bgE7deb/xfH/ksiUFxUqoDhMWe1+q8IlSogX9sszP4DzERR2usFISbkY//55P8FXWhN\n8//8cv96+zAs53qz7gTJ/9MvXH65f/qYSy+qAo6FX/5fWJ8cv2M/SvZfnIvZwgLQWsvA/ptLb//p\nuX/6eOXxJASdF978vzj2n1S14ggUk/w/EVWTk+b7jcI75iwrVeMoqvTcPyHOB4GgCh/z/4hNcqtU\nefP/gi60pnOU/HL/evswLOf65f7p49AvXGFtAUYh/y/oWPjl/4VN2PYe+3HM/RMWF4H11WzsPz3/\nL6pPlZ/9p+f+CWkbIYYt9dcvwKb2n7R+iNrWi0n+3/q6EpY2106IZZlVpWptTY3bde6fkGf+nzf3\nT4jzQSDM/mP+H7FFbnOqvPl/YfYfEN2iIOoN3CT/L8jykvsGhERIV/BRyP+LOhYDAtPw4hm137KR\npFLV+DgD+88z5086qgfhZ//5/R5Z2X9yn5w3pvafvs+4VZ+oKkqzaX/1nG7/zc4Oi9ak6JWqIuT+\nCXlXqoLOKyBa3Pnl/nn3QQuQ2CC31X9AX5j45f4NbRthp0VZUCb5f365f/o4BiyvkK7go5D/55f7\nJ3hXZMax/0SYln1OVZzcP2FhAVj7KBv7D1B/s2arifX2emz7z+9vl3ZFl+l5YWr/6fuMW/WJuuBL\npcomuv1ny/oDBkVVEbqpC3nm/wWds6b5f365fwLz/4hNcrP/gP4F1i/3r7etof0X9akYiP704pf7\np49DH0Oo/TcC+X9+uX+Cd/Vm1LHX8//GMfdPWFwEVq/V0Gw3sdpctWr/AUroi6VnMlFdt//8/nZp\n8//8oouEJPYfkLxSFWVNra/b61El6PafTVGl96kqQu6fkKcYCTpnTfP/bFwvCDEhN/sP6Fd7wi60\nUjUyrVSlKef6Zd3pY75Yv9irPvnl/vW2HYEGoFHHwjtRPeziCfSXOI+K/ZekQrCwALSuq6vrytqK\n1T5VgBLCUn0Ks/8mKhOYrc5G2n9Aup49frl/vTEvKGHabMaz/+RCZ9v+y6JSpdt/tuZTAYOVqiJE\n1Ah52mZh56yJDWnjek
GICU7svzArTfL/0sypMv0E5Zd1p4+52W728v/8cv/0bYHo6lqRiToWev6f\nTNj2u3h6j/2o2H9JRRWa6ura7rSttlQA1PkmQinM/gNUJaveqAfm/uljTiOqgvar5/+JqNLjpYbG\na8H+C1vRlcWcqqzsP6moFdH+A7IXI0G5f4LJhPmwiCnm/xGb5CKqvPl/UQG7Jvl/frl/vefHsP/C\nxgD0BWBYW4BRyP8zORaS/xf1qRHoH/txzP0TFhcBNPu/ty37T/L/TO0/QFWy6s165EU5zYqusP5J\n+gVYOqSHTeS2Yf+Fregqk/1XqahqlYgq17l/Ql75f1HnbNpKFfP/iE1yEVXe/L+oeTYmq+nCLuym\nnzz8su70MQB9oRTWFXwk7D+DY9GrNBqIKjn2YiuOU+6fsLAAoNG/utqy/yT/z9T+A1Qla7W5auUC\nFYSp2DYRHTbsPxmTH2Wy/4C+qLp40X3un5BXpSprUWW6D0JMyEVUAYOr6aLm2XhjOPwI6xlkkv8X\nlPunjwHQhERIV/Cy5/8F5f4J3hWZYcfez/4ru/UH9H8fk9w/Qbf/AFiz/4B+NTeW/adVqsLsv6Qr\nukzsP71SFTpeC/af/nwvWdp/titVwGClqgjWH5Bf/l/UOSvNO8Py/y5e9M/90/dBUUVskJuo0leQ\nBeX+6dua2H9hby5RDUCDcv96z49h/5U9/y+ycuhZvRl27L35f+OY+ycoUWXf/gP6bS5i2X+NeuRE\n56QruoJy/wRd5JiIDl1UVSrh86/8iLKmsmqp0G6rCflZiqoirPwD8sv/izpnTfL/0l4vCDElP1Gl\n5f9FXWhN8v9M/knCLgxR1TLd/gvL/dO3L6v9Z3osluvLgbl/gjf/L8xWLBNJKgRzc8DspH37D0Cm\n9h8QX1RF7VcX23Hsv+VltW1c99jE/rM9p0ofcxb239pacXL/hDxsMxvnbNrrBSGm5Gf/afl/UQG7\nUfl/nU70J7aocm5Q1p2g5/+F5f7pYy6tqIo4Fnr+n0mfHP3Yj5L9l+Ritm1jRvbfXHL7zy/3r7ff\nlKIq6LzQ8//i2H/r68kESlT+X1aVKtn3ONh/QH6iyi/3TzCZMB8VMcX8P2KLXCtVkv8XFbAbNUcp\nLPevt4+Icm5Y7p8+juXV5cjVivJYWe2/qGOh5/+ZTNiWYz/OuX/Cwnx29t9qc7Un5OPYf365f739\nJlzRZdI/SS7AJpWqmZn+GJMIlKj8v6zmVPn9bIPZWSVGi5L7J+SR/xeU+yfYqlQx/4/YINc5VZL/\nZ2L/AcGr6Uwv7GH5fyZNKUUomXQFL3P+n+mx0CtVJhfPUWn8CSSvECxqosqq/dc9pudXzgMwsP+q\nffsv6m8H2Lf/5DFT+69S6VeokgqUsCpKVqv//H62wcwM8N57xcn9E/KqVKU5Z8XZyOK8J8RLrqv/\nAPREil/un9+2fpj0DIrK/wvL/dPHsby6HNqs1LttGfP/wnL/BJkYbXrsl5dHp/Fnktw/4eat2dl/\ngBJVk5XJSMGm96kK+9slXdEVxxY2sf+AvphKKlCiRFUWfar8frbBzAzwzjvq56KJqqzz/6LOWcn/\nC6qYffyxGl/UuSmvRUgacrX/AODs1bOBuX+9bSPsP9NPxUDwP1pY7p8+juW6of1X4vy/sNw/QSZG\nmx77K1eAn34YXeErA0ly/4RbFqeB9iQA+/YfoERVbboW2QesNlXDanMVH1xsh/7tkq7oCsv9E+LY\nf0B/m6QCJcyaynJOlfdnG8zM9Fe3Fc3+A7IVI1GT86Py/2xcLwgxJVf7DwB+uPzDgdt+ROX/xfkn\nCfxHC8m66+1jbtD+88v9621b4gagpsdC7D+TiycAnL84GvZfmmiQxYVKrwFoFvbfhWsXjLrVy0T2\n5atrkb9HEksnLPdP3+/Vq8CHH5qJjiztvyzmVGVt/wlFq1QB2YoqE+s97G9t43pBiCm5
239vLL8x\ncNuPqPw/k0mxUZ+gwrLu9DE3202cuXomMPdP3xYoZwNQ02OxsraCn15sBOb+9bbt7ur8pdGw/9KI\nKr0BaBb2X6PViFz5B/TnXH14vR5Z6UgqqqL2K483m/nZf0EruspYqRLGSVRF5f4JYVVJkykLzP8j\ntshNVEn+3xsXlaiKql6E5f+F5f71nm9g/5mMQcZsUsmR/ZaNOMfivZVLRp8aAeDdq+Ob+yfo+X82\n7T/J/wOiJ6kD2urA6Xrk3y/Jii6T/kn643nZf0Eruso4pwooTu6fkHX+n+kHmrSVKub/EVvkJqok\n/+/NS28CiJ5nE7aazqQcHPXJw6QppTz+5qU3jUVHKe2/GMfig4+WjS+eH3w0vrl/gp7/Z9P+k/w/\nILqdAqD1sZpazdT+i9qvkJf9J2PzUsbVf0Bxcv+ErCtVeYmqqH0QYkpuogpQNtDaulomEiVSwvL/\nTHoGheX/ReX+6WMAVDZe1LZlzf+Lyv0T9Pw/U5vn0vXRafwJxMv9E7Ky/4D+/1Ac+w9TZvZf3BVd\ncew/ID/7T8bmpWx9qkRUFcn6A7LP/zNZVSqPB+X/ReX+6fugqCJpMRVVBwEcB/BQ9/u+JC+mf7KO\nuhCExb6Y9gwKWsUUlfvXe74m/KK2LWv+n0kPLqB/LFaaFyOPvUSSXG2Mb+6foOf/2bT/gL7QzcL+\nA8wvMFG5f4IL+w/wfw/Iwv6rVvvnSBbNP4FirfwDss//M5k/qz/ul/+X9npBSBxMRNVBAEcBfAnA\nkwAeAPAMEggrucCaXGjD8v/i/JP4XRhMm1LqQsOkLUAZ8//iHosbk9H2n+T/fdQa39w/YW4OmGzZ\nt/8AZGr/AeaiytReEbENjKb9B/THnJX9V7RKFZCtbRbHutO39+4jzfWCkDiYiKpjAI5ot69BVase\nj/ti0vDTxBIKyv8zyf3r7SOgnBuVdSdI/l+cMZdOVBkeC8n/Qy3a/gPUsb8+MTr2X5qL2exkRvbf\nXHz7b2K2Hpj719tvQlEVdV5I/h+Qj/0Xlv+Xhf0HpK+uBTHOoios908Iq0qaRkwx/4/YIEpU7Qew\nGcArnvtPAdgDYEecF+tVqgyrPsDwHCWT3L/ePgLKuSa5f95xmFbXymb/mR6LicoENk9vB2rR9h8A\nbL25jtbEeOf+CXNTGdt/BpUq2eamrfXA3L/efmOu6DK1aPRt8rD/wvL/sqpUpRWCQYioKpr9B2Sb\n/xeV+yfYqlQx/4+kJUpU7el+P+u5/6zn8R4rQbkwiC9QgOHVdHFWYgXl/8XJpIsrBMs2UT3Osdg4\nsQDMRdt/ALDxFrP9Li0tRe/MMWkrVVIlsm7/dY+tyZwqqWbdNL8avd+M7D99myT2X5JzJaiKksWc\nKkCNeXY2OLA6KeNcqYpzXunjWFpaMsr9C9sHGRsi6vfmRP3rf7L73auURMvv8j4hTFTJJ+uw3D/v\ntt7KT5yeQUH5fya5f95xxLH/ypT/Z5L7J8x1Fo3tv7kFs8afRRdVaXL/hI0zxbH/5jbXI7eNu6LL\n1P7Tt0li/9kWVVlVqmxbf0DxRVVW+X+mUz388v+WlpaMcv8E5v+NNdZEVdTbSrBCiqDRUBcknc1T\n6h1hfnoh8h9w46Ta9sKV97DycX/j8+8AqAKbtgJrPstnB15vm9r2zHngU9r17J1rP0Xt3RrazWms\nNcP3sXVmoTeeqDHPT6v8v3euXsLG6Y347l+/jF/53OfDnxRB0n2YPu+daz/Fxvc24caN8Pr6yy8v\nYaqxANRew8Yta0PH/uWXXsbn7+u/3uSW9wAAtcoCnn9+CZ///F7f/a6vZxvGmpYrV9Qci2vXlgDs\nTbSPzXPqKttqTAf+ri+/HHyMAvc7pa4C05Vab79B+5noKGUyvelar61J77U9fzsA2H4L8M77wIon\nytLvvPrJT4GpDepLzgu/fQLAlgUAVeB/vPYybrnV
//yU505tUNtWZ9V+19vrQ2OPYtvNwN///fDv\n0WirfUe9h8Rl9iZgYuplrK2n+7/3MjENoArMb1djDjq+cUi6D+/z5G/61gXgE58Y3NbkfShsm/cv\nAZu2BB9PfSxbFwfP2bXGOs6+vTZw3MLYtFX9Hud+AvxCjhGuNq4Teew/yX7iPMd026jtwh73PnbT\nBrtTMgAgqivjwwAeg7L5Xtfu3w/geaiVgd/W7l+amZm577Of/Sx+8IPhChGqq8AnXgEufRqo3xzx\n0h3g9pcNfoWEXKkCH/1y9Hbz54DNbwMXPo/Iw1X7ANj+9/3bK0ivf5PuI87zLk8BH/9SxEbngS3r\nwKZ3473eO/8z0HofwdPvzoc8Vhy2bz+Pu+/ekei5b7z9Li61zwBv/zLQCfoccx6xj8PkDeDW7wOX\nfxb4+Jbo/fzMXwOV4dW0sc4V022jtgt7POgxG/9PeZDHOMvy3mKybRbniulrF4Gsx2lr/0n2U/D3\nlp+bvwfvnT+Da9euXYDZG/BmqMV6gUSJqt0AXoMSUS9q9x8C8BSU/Xfe85wzGKyArSBFxStD5pH9\nuGy8RtJ9xHmeybZR24Q9nvSxIpH1OG3tP8l+bJ8rJtslOSd4rth9Db63FAO+t8Tb1sb5Mo9B6fWN\n7pcfV6HEFKAKTX9kMMZQrkBVq3SOAPhxwv0dSjccMgYchBLnV6DEOyFhHISqnPN8IabMQ73HEBLG\nIQCfsb3TAxg8+eah3ry+EHM/m6F6W/n4D4T02IO+iN8Ndb484G44pODsRL8R8Wao9ybrb4Jk5DgC\nwKf/OiEDnIH6wGb9GvQABmNqfjPhfnaCooqE4+3U/xSGK6WEBPEqgE2uB0EKzQGo9xl2pCJRHIC6\nBrWRQWVzF5S6D4qn0fMBn4KqMvjtg6Jq9LFxrgjPI7mIJ+UgzvkSlj16qLsNGV3Sniu7oIoEO0FR\nNQ7Yem/ZDCWqrL2/HIC6uLXhf4E7CHWCyidEOWG9F0uKqtHH1rkCKKv5LzMYIykO+6CyRKPOF0Es\nPu+bn37eJQp8J4XHxrkiVe9doKgadWy9twji2FlD5rf4De4tAP/Jc99xqDc5HYqq8cDGuQKoKhat\nnNEn7Hy5CuDrnvuegrL5/HgMlt/4SKFIc64cgvoQB1BUjQs231v2Y/jaNUScMIWg9gt7oE7UFzz3\nn+gOYkeM1yCjgY1zRbzsD20PjhSOoPMlSfboCXvDIgUk6bmyE0pUvYD+5ON5qFXsXNgwuqQ5X3Zi\nMDXmINRiu1BsBDXs73735gPqUTbnLbwOKT+m58pBqIrW693793RvhzZdIyOHSfboeQz2pdkNtlUY\nR6LOld0A7tXu3wzgHIBPZTwuUkxM3lu2Qs3HOgrVr/MpGGgZG6JqW/e7t9mW3N4F1Th0M4BHAHSg\nSnHfBhk3TM6VCobtm9cA/JMMx0WKiUn2qMybOA5VhTiFwUbFZDyIm1NbgboWkfEk6nzZCdXk81jc\nHdsQVVG9PqRr6TUAX+1+kfHE5Fw5iXi2NBldTLorn4T6REnGm7idvlfQ/5BHxo/MOtjbuHhJucyb\ntjPveZwQniskDiLCveeLiCieL0TguULikNn5YlNUecurMrhTFl6DjAY8V0gcZEGDtxIl5w/PFyLw\nXCFxyOx8sSGqTkFdLH/Rc/8/hZoLc97Ca5DRgOcKicNpqDL9Fz333wO1cOF83gMihYXnColDIc6X\nPVD9Hg74PHYAaoKXJDknzQckowHPFRKHqPPFRvYoGQ14rpA4FPZ82QfV16MF1eHab4D7oJYcSrt3\nnsjjCc8VEod9UOdAC8DT8O9mbCt7lJQbniskDjxfCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQggh\nhBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYSQUeQtAJ/p/vwwVNd7GzzU/TrT/U4IISNJ1fUACCGF\nYSeA17s/78Jg
LlZSnoGKK/omVIDp5vDNCSGEEELKjYSJCs/DTiZjG8zTIoSMCROuB0AIKQT3Ajjb\n/bkCVak6l3Kf89r+CCGEEEJGmkNQ1aSgr1dCnvswgMe6X89DJb4LB6BS39vdx56Cf0o8AOwGcATA\nq1AW5Avd5x0PGO9j2r5lu33d5z7e3c9uqDliz3eft6v7GpJGr88Xk9c/A2VPPgNVtXu1e3tPdz9X\nfMa0v7uvh7rbPxbwOxJCCCFkTHgKwO91f96J6PlUzwD4T9rtnVAC5zGf+0zsPxFSvwdgE5Qoa3e/\nC/sxaFEeB9Dqbi+v91Z3m9/rPvfr3cfe0n5Gd9+6CHxGe31Aiak2lFCS8e/2GVNbe/3NsDe5nxBC\nCCElpAIlHqSSdBD9Co8fe6DExGc89z/VvV8mo++Cuag6AuCy5742BoWQTHoX9vmM4xmf/QBKLOnb\nXfHs++HuvnReBfDjiDG1MSikdvq8NiFkTOCcKkLGm6tQ1Z49GLTd9nd//h2f59zb/b7iuf9E9/s9\nCcfiN/dqXvu5A+CT2u2r3e9nMcgVDPNHAM6j397Bu28/rvrct+J53lehLMkzUBWstPPQCCElhqKK\nkPFmS/cLUO8HEwCehRIKE1CtEILY5bntFVm2OdJ9Tamo3Qs1h+pDg+ceghKLRwA8aXFMx9AXkc+A\nc6oIGWsoqgghnwRwSru9C8r6CuKF7vcveu6fh6omhT03jE7E4yehKkNfhZoPdRnAvzXY7y4oi+4x\n9AWYrRWJuwGcBnAHgCegbMRNoc8ghIwsFFWEEL2dAqBEiNdS0zkH4ChU9Udv5nkISvB4K0emAsZv\nO/2+QwDuB/A0lDVXgbItvWzz3JaK2pe64z3Y/b4N/fF7nxM0pnntvnkAv689dhRKGJpUzgghhBAy\ngjyG/uTreag5ViY8BDWhXdoq6BPSd6O/Ou8vMbhizous6muhv/rucag5XT9Gf4L5fqi5S2e620tL\nBamMHdL283UMCj5pwfBKd2yPob9KcF93n/L6Iry8Y5LJ7DKm+e7tp6EqZ0/BTsNUQgghhJBMeQzA\nDs99O6HEDFfdEUIIIYQYIFUqPx4IuJ8QQnKFc6oIIWVgHmpu1EPotzTYhf6EdUIIIYQQYshDUNWq\ndvf70xi2AwkhhBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYTY4/8HV+o9\n2JY0PvgAAAAASUVORK5CYII=\n", | |
"text": [ | |
"<matplotlib.figure.Figure at 0x113052e10>" | |
] | |
} | |
], | |
"prompt_number": 143 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If you want to apply the filter to the RDDs that are generated automatically, you can use the ``apply_filter`` method. This uses the default ``nmin`` and ``nmax`` that you initialize the ``SparkBloombergCaseVectorizer`` with, or you can specify a different set at this point. By default, this filters on the number times an ngram occurs in unique documents. If we want to change this, we can pass in a different function, e.g. to do the filtering by judge usage: " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.apply_filter(nmin = 5, nmax = 10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 144 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"len(cv.ngram_rdd.first()[1]) # number of ngrams in the first document" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 145, | |
"text": [ | |
"127" | |
] | |
} | |
], | |
"prompt_number": 145 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"bins, hist_filtered2 = histogram_rdd(cv.ngram_rdd, 100, 1, 5)\n", | |
"plt.plot(bins,hist_filtered2)\n", | |
"plt.xlabel('\\# of ngrams')\n", | |
"plt.semilogx()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 146, | |
"text": [ | |
"[]" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"png": "iVBORw0KGgoAAAANSUhEUgAAAmMAAAGOCAYAAADb+gS8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3WtsHOd97/HfkqIkOqS5pHNxrpboJkBboI1kp5cXbdgj\n2WiBIkWP5RRo0NNTIJZ70BenKGKn7pvoTRO76fWgQGU7aIuDHNSRYqNAL0Asyd2mQNvUtuz2RVHA\nulBAbo1tcmmp4koUuefFs493dnYuz8zOnd8PQCx3dubZWXvE+e3/eeYZCQAAAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAKiMJySddFz3mKRTkh4ZPB7Ja6cAAAB2g8OSdiT9icO6xySteZ4vDJ4TyAAAAFI6\nKekluYWxdUlfCNkeAAAAkqYSrHtS0mcltRzWPSpTCXvRt/y8THXtQIL3BQAAaCzXMHZM0vOSNhzX\nPzx4vORbfsn3OgAAwK7mEsbaMpWu5wbP+w7b3D147PqW2zFkyw5tAAAANJ5LGHtc0qMJ2/WHMAAA\nAATYE/P6cUmnJb3lWdZS/LixNwePbd/ypcGjv/tSkjoxbQIAAFTFSlYNuYSxoPFdhwavHZf0pYDX\nz8hU1JZ8y2335Hnf8s7+/fs//p73vOftBe12W+22P8uVr9vt5r5fWbxH2jaSbOeybrfb1Xe+01a3\nK/34jydrI+1rVZL3fmbVfpp28jhWJlkn7DWOlWzfo0p/W6LW4W9LNdpv0t+Wbrerbtd0/H33u9/V\njRs3Lkj6PqcdndCCzJWP9uegpAuSvjJ4vhCx7ZpMIPN6UtJrAet27rrrrn4dfO5zn6vFe6RtI8l2\nLut+7nOf6z/wQL//zncmbyPta1WS935m1X6advI4ViZZJ+w1jpVs36NKf1vSvs7fluLab+rflo9/\n/ON9SasT5KsR0zGv35AZ/+X9+WWZQPb/Bq9LZoD/GUn/IOm7g2UXJP1vSf9n8Lwt6SlJ/0PSZd/7\n/M92u33g13/911N/kCIdOHCgFu+Rto0k27ms+/WvH9Dly9JjjyVvI+y1TqejlZWV2PeugryPl6za\nT9NO1sfKpOsEvcaxkv17VOVvS9w6/G2pRvtN/Nvy53/+57py5cqGpD+MbdiBy5xhfi/JzB/2vzzL\njkr6mqT7JL3gWf7QYNmLkj4m6RkNr8r06tx1110fX11dTbE7qLojR6R//EdpczO7Nk+cOKETJ05k\n1yAai2MFSXC8wMXKyor+/u///ooymjc1bsxYkHsDlp1VcJXt6cFPrDr00SOdXk/a2sq2zbp8c0X5\nOFaQBMcLEshs5ogkM/DnijDWXJub0va2+ckKfzDhimMFSXC8IIHmhTE0V69nHrOujgEA0ASEMeTO\njhUjjAEAMI4whtzZytjNm+XuBwAAVUQYQ+5sZYwwBgDAOMIYcseYMQAAwhHGkKudHenGYGpgKmMA\nAIwjjCFXNohJhDEAAIIQxpAr76z7dFMCADCOMIZcecMYlTEAAMYRxpArO3hfIowBABCEMIZcURkD\nACAaYQy58lbGGDMGAMA4whhyRWUMAIBohDHkijFjAABEI4whV0xtAQBANMIYckVlDACAaIQx5Iox\nYwAARCOMIVdcTQkAQDTCGHJFZQwAgGiEMeSKMWMAAEQjjCFXXE0JAEA0whhytbkp7dtnfqcyBgDA\nOMIYctXrSbfdJk1PE8YAAAhCGEOuNjel/fulvXsJYwAABCGMIVe9njQ7a8IYY8YAABhHGEOubGVs\nZobKGAAAQQhjyJW3MkYYAwBgXBlh7GAJ74mSeMeM0U0JAMA41zD2qKQ1STuSXpJ0yHG75cE23p8H\nEu4jasxWxuimBAAg2B6HdR6X9C+SDki6W9JpSeckLTls+6ik475lX0qwf6i5zU1paYluSgAAwsSF\nsQVJZ2TClyS9IukJSU9Kul3SWxHbtmUCG+FrF+NqSg
AAosV1U25oGMSslqSXFR3EJOkxScdkujdP\nyr1rEw3C1ZQAAERLOoC/LemopCMO6/6LpKckvSnTVfmypIcSvh9qjqspAQCI5hrGFmTGf12SqXa5\ndD0+K+lXJX1Y0n2SujLdm1xNuYtwNSUAANFcw9iGpN+RqYidlQlkSapc5yTdM/j94QTboeaojAEA\nEC1pN+Urku6XqXIdTbjtZZkgt5BwO9TUzo504wZTWwAAEMVlaosgpyQtpthuQybIjel2uzpx4sTb\nz1dWVrSyspJm31ARvZ55pJsSAFB3nU5HnU5HkrS6uiqZcfSZSBvGlmWukEzqkKTfDnqh3W6PhDHU\nnw1jdFMCAOrOWyTqdDq6cuVKYHEpjbhuygWZQffeaSkOS1qX9Jxn2VFJFz3rHZaZn8x71eVxmQlj\nX51gf1Ejm5vmkaktAAAI51IZu0dmWoqvSnpR5orKXwhY74CGXZfrMldNnpEZJ3Ze0vMyU11gl6Ay\nBgBAvLgwtiHpXod2zkqa9jy/LOn70u4UmsFbGWPMGAAAwZJeTQk481bG6KYEACAYYQy58VfGCGMA\nAIwjjCE3/jFjdFMCADCOMIbc+K+m3N42PwAAYIgwhtz4K2MS1TEAAPwIY8iNf8yYRBgDAMCPMIbc\nBFXGGMQPAMAowhhy4x8zJhHGAADwI4whNzaMMWYMAIBwhDHkxnZT7ttHNyUAAGEIY8jN5qbpomy1\n6KYEACAMYQy56fVMF6VEZQwAgDCEMeTGVsYkxowBABCGMIbceCtjdFMCABCMMIbcBFXGCGMAAIwi\njCE3QWPG6KYEAGAUYQy58VbG6KYEACAYYQy54WpKAADiEcaQG66mBAAgHmEMueFqSgAA4hHGkBuu\npgQAIB5hDLnhakoAAOIRxpAbKmMAAMQjjCE3m5uMGQMAIA5hDLnY2THBi25KAACiEcaQi17PPDLp\nKwAA0QhjyIUNY7YyNjUlTU8TxgAA8COMIRebm+bRVsYk01VJGAMAYBRhDLnwV8YkE8YYMwYAwCjX\nMPaopDVJO5JeknTIcbtjkk5JemTweCTpDqKegipjMzNUxgAA8NvjsM7jkv5F0gFJd0s6LemcpKWY\n7Y5Jesqz3oKky5IeHGyPBgurjBHGAAAYFVcZW5B0RtJzkt6S9IqkJyS1Jd0es+3Tkp70PN+QqY49\nkWpPUSthY8bopgQAYFRcGNvQeBWrJellmXAW5qhMkHvRt/y8pMMyVTY0WFBljG5KAADGJR3A35YJ\nWnFjvw4PHi/5ll/yvY6G4mpKAADcuIwZk0yV62FJvykTyCTpkxHr3z147PqWrw0elx3fFzXF1ZQA\nALhxDWMbkn5HZvzYEzKD8x+SGRcWxB/CUGPf+pZ048b48g9+cDizvh+VMQAA3CTtpnxF0v0yYeto\nxHpvDh7bvuX2ykp/9yUq6mtfkz7wAenuu8d/fvmXw7ezYYwxYwAARHOtjPmdkrQY8foZmSkx/NNf\n2O7J8/4Nut2uTpw48fbzlZUVrayspNw9ZOWb3zSPf/AH0pLn/+bv/Z508WL4dv57U0pUxgAA9dXp\ndNTpdCRJq6ur0njBKbW0YWxZ0smI11+RqZ7dL+kFz/J7JF2UtOrfoN1uj4QxVIOtcH3qU9K73jVc\n/rd/K50fi9Tj2/nHjF27lv0+AgCQN2+RqNPp6MqVK5kNyXKZZ+xJjc64f1jSuszcY9ZRmZDlXe8h\nmbFlVltmwteH0+4sihc09kuSFhel9fXw7WxlbN++4TK6KQEAGOdSGbtHZl6xr8rMG3ZJ0i8ErHdA\no12Xz8p0U54abPcxSZ/WaKUMFRd0VaQ0DGP9vtRqjW+3uWkCnPc1uikBABgXF8Y2JN3r0M5ZSdMB\ny59W+BWXqIHNTWnPHvPjtbQkbW+bbsf5+fHter3xAMfUFgAAjEt6NSV2mV5vvItSMpUxKbyr0lbG\nvOimBABgHGEMkT
Y3xytcUnwYC6uMEcYAABhFGEOkLCtjdFMCADCOMIZIWVbG6KYEAGAcYQyRsq6M\nEcYAABhFGEOkuMrY2tr4axJXUwIA4IowhkhhlbH5eWl6OvnVlNvb5gcAABiEMUQKq4y1WlK7nfxq\nSonqGAAAXoQxRAqrjEnRt0QKGzMmEcYAAPAijCFSWGVMig9jYZUxBvEDADBEGEOktJWxoO1mZswj\nYQwAgCHCGCLlURmjmxIAgCHCGCIFjf2ywsLYzo6pfoWNGaMyBgDAEGEMofr94KsiLRvG+v3R5b2e\neQyagV8ijAEA4EUYQ6itLVPligpj29vStWujy8PCGJUxAADGEcYQyoaqsG7KpSXz6O+q3NwM3o4x\nYwAAjCOMIZQNVVGVMWk8jNFNCQCAO8IYQsVVxsLCWFxljDAGAMAQYQyhsq6M0U0JAMA4whhCZV0Z\no5sSAIBxhDGEcq2Mra2NLudqSgAA3BHGECquMjY/L01PczUlAACTIIwhVFxlrNWS2u3wMMbVlAAA\nxCOMIVRcZUwKviVS2HZ0UwIAMI4whlBxlTEpOIyFbUc3JQAA4whjCEVlDACA/BHGECrryhhjxgAA\nGEcYQ6iwKSq8oipj+/aNLqcyBgDAuDLC2MES3hMphE1R4WXDWL8/ut3+/eZqSy9bGWPMGAAAQ65h\n7FFJFyXtSHpe7oFqebCN9+eBhPuIkmxuSnv2mJ8wi4vS9rZ07dpwWa8XXE2bmjLzklEZAwBgKOI0\n+7YnJa1JekTSj8gEs5dlAtlGzLaPSjrueb4m6bnku4ky9HrRVTFp9JZI8/Pmd1sZC7J3L2EMAACv\nuDBmK1uPDZ4/J+lFSaclfVLS0xHbtiUtSfrShPuIkmxuRo8Xk0bD2Ic+ZH4Pq4xJJozRTQkAwFBc\nN+WCpM/6lj07eFyO2fYxScckXZB0UowVqx2XytjSknn0DuKPqozNzFAZAwDAKy6MvSLpLd+y9uDx\nTMy2FyR9VVJfpqvyohgvVitJK2NWXGWMMAYAwFCaqymPyowZeyFmvadlujI/LOk+SV2Z7k0qZDWR\ndMyYFTdmjG5KAACG0oSx35T0YMJtzkm6Z/D7sRTviRIkqYytrQ2XRVXG6KYEAGCUy9WUXidlrpBc\nTfFelyWdV8hYs263qxMnTrz9fGVlRSsrKyneBllxqYzNz5vpKvyVMTuWzI9uSgBAHXU6HXU6HUnS\n6uqqNBy2NbEkYey4zBxjcd2TUdZkxo6NabfbI2EM5dvclO64I3qdVktqt8fDGFdTAgCaxFsk6nQ6\nunLlSjertl27KY/JDMT3zxGWdPzXssygftSAS2VMGr8lUtR2VMYAABjlEsaOSnpYpqp1zPNzStKb\ng3UOy1S8Dnqen5Z0yNPOozLdnKuT7jSK4TJmTBoPY1HbMWYMAIBRcd2Uh2W6JvuSjvheO6PhtBdL\nMn2nC4Pn6zLB7GVJZ2XGij0j6dXJdxlFiRqI70VlDACA9OLC2Hm5Vc/OSvKOLros6d60O4VqiJqi\nwmtxUbp8eXS7qDFj3vtYAgCw26WZ2gK7RJrK2M6OqXwxAz8AAG4IYwjU7yerjK2vm216PbOMGfgB\nAHCTdJ4x7BJbW6bK5VoZ29423Y922gpm4AcAwA1hDIFshcu1MiaZ6tj0tPmdqykBAHBDNyUCbW6a\nR9fKmGTCGN2UAAAkQ2UMgSatjNFNCQCAGypjCJRXZYxuSgAARhHGEChJZczeFHx9fRjimPQVAAA3\nhDEESlMZW1uL345uSgAARhHGEChJZWx+3owV83ZTRk36ur1tfgAAAGEMIZJUxlotqd0e7aaMqoxJ\nVMcAALAIYwgUNxDfz87CH1cZI4wBADCKMIZAcQPx/WwYc62MMYgfAACDMIZAeVXGZmbMI2EMAACD\nMIZAVMYAACgGYQyBJq2M7dsXvB5jxgAAGEUYQ6BJKmP795srLIPQTQkAwCjCGAL1
etKePebHxeKi\nmTvs9dejAxzdlAAAjCKMIZCtcLmys/B/5zvRXZt0UwIAMIowhkCbm+7jxaRhGPv2t6O3o5sSAIBR\nhDEE6vXSV8bopgQAwB1hDIHSVsbeeINuSgAAkiCMIVDaypgUvR3dlAAAjCKMIVDSytjS0vB3l8oY\nYQwAAIMwhkBJK2Pz89L0tPndZcwY3ZQAABiEMQRKWhlrtaR22/xOZQwAAHeEMQTq9ZKFMWk4bowx\nYwAAuCOMIVDSSV+lYRjjakoAANwRxhAor8oY3ZQAAIxyDWNPSFqTtCPpJUmHHLc7JumUpEcGj0eS\n7iDKkVdljG5KAABGudwG+kmZIHZM0j0ywexlSYuSNiK2OybpKUl20oMFSZclPSjpXMr9RUGojAEA\nUIy4ytiypNckPSbpBUlflAlTknQ8ZtunZYKctSFTHXsi+W6iSP1+/pUxxowBAGDEhbEFSb/rW/bs\n4HFJ4Y4Otn3Rt/y8pMOSDjjuH0pw65a0s5NPZWxqysxHRmUMAAAjLoy9EvGaP2h5HR48XvItv+R7\nHRW0uWke86iMSaarkjAGAICR5mrKY5IuSnouYp27B49d3/K1weNyivdFQWwYS1sZcwljdFMCAGC4\nDOD3+00Nx42F8YcwZ888Y07mP/dzaVvApHo985i2Mha33cxM8srYSy9J73639KEPJdvO6vfNsfX6\n6+Ov/cAPSEePxrfxX/8l/cM/SD/90+n2ISvf+Ib0gQ9I739/ufsBAMhG0jD2hKTPS3o1Zr03B49t\n33I7zszffalut6sTJ07oqaekd7xDWlhY0crKSsLdQxbSVsY+8hHp9tvNY5Q03ZQPPij91E9Jf/qn\nybazVlelX/zF4Nduv13aiLoueOAv/kJ66CHpW9+S3ve+dPuRhZ//efPf44/+qLx9AIDdptPpqNPp\nSJJWV1el8YyTWpIwdlzSNxTdPWmdkfS4xgf52+7J8/4N2u22Tpw4oRdekPbskchh5UlbGfvAB9xC\nTZpuyv/8T+l730u2jdfVq+bxz/5M+sQnhst///el3/5tEw7ttBtR+yCZ6lqZYWxjQ3rrrfLeHwB2\no5WVYZGo0+noypUrqXsB/VzHjB2T1Nd4EDsYsv4rMl2V9/uW3yMz3mw17I327x9WZlCOtJUxV0m7\nKW/cMPu0vp7+Pe1nete7pKWl4c9732uWu7Rt15lkPyZlpx3h3wgANIdLGDsq6WENJ361Pyc17I48\nKhOyvDPzPzRYz2rLjDV7OOrNZmc50ZTNVsbyCmNJuymzCEFhn2lpyb3tKoSxrS0TyOznAQDUX1w3\n5WFJz8tUxfy3MjojydtZckBmVn7rWZluylMy02B8TNKnZSaPDbV/PyeasqWd2sJV0m7KLEJQ2Gey\nFx3UJYzZz8EXFgBojrgwdl5u1bOzkqYDlj89+HFGZax8u6kyVrcwZj8HX1gAoDnSzDOWKypj5cu7\nMpZ0zJgNP3bsWBph4+DqFsaojAFA81QujFEZK19VK2P+35MIu0K0bmGMyhgANE/lwhiVsfJVdcyY\n//ckwipj7bZ7u1UIY1TGAKB5KhfGZmfNiXp7u+w92b3yroyl7aaUpLW18PWihFXGZmakubn4dre2\nhnOVpd2HLFAZA4DmqVwYsydLTjblKaIyliSMecNP1pUxyXRVxrXb9UztR2UMAJClyoUxe7LkZFOe\nXs/cBWFPmjuXOkjTTdlqDX9Po9czbczMjL/mEsbs660WY8YAANmqXBijMla+zc38qmJSum7KD35w\n+Hsam5sm6NtQ55UkjH3wg9WpjPX75e0HACA7lQtjVMbK1+vlN15MSnc15YEDk1Wler3wgJkkjN19\nt/m9rCBk/13s7CS/vycAoJoqF8aojJUv78pYmm7KO+6QFhYmr4wFSRLGlpfNxSXXrqXbj0l5/13w\nbwQAmqFyYYzKWPmigksW0nRTLi66haYwWVXG
lpdHnxfN+++CfyMA0AyVDWN86y9PFbspJw1jcZWx\n69ej98m+78GDo8+LRmUMAJqncmHMVi/41l+eKnVT2lsgZRHGwj7T0pJ5jGp7fd2EuTvvjF83T1TG\nAKB5KhfGqIyVL+/K2MyMGXflMrGvDT1ZdFNGVca87xW2H3Yf4tbNE5UxAGieyoUxKmPlK6IyJrlV\nx7IKY1GfqU5hjMoYADRP5cIYA/jLV8SYMclt3BiVsVFUxgCgeSoXxpjaonxVrozZMWRJxQ3g975X\n2H4sLkrz89L0NJUxAEB2KhfGqIyVr4gxY1K6ypiU7kbdcVNbeN8rbD8WF83Es+12eTcLpzIGAM1T\nuTBGZax8RVXG0oaxNFWpqMpYux3frg1jdl/KrIzddtvwdwBA/VUujO3da6oPnGjKU9SYsSTdlO32\nZGEsqjI2MyPNzYVXu7a2pKtXqxHGer3hfvCFBQCaoXJhrNUyJ01ONOXo94u5UbjkXhmbmzPb5FUZ\nk6IDVrc7XCdu3bxtbg7nReMLCwA0Q+XCmGROmpxoynHrlrkJdZWupvSGILssiVu3zE9UwIwKWN6u\n0rh182YnwLW/AwDqr5JhjMpYeewJvipXU66tDStBLjPlB7HHUtrKmF3u3Y8yuyntGDf+jQBAM1Qy\njFEZK49LcJlU0m5KWwlaWDDd2GnDWNaVsX4/2X5kYXNz2G3LvxEAaIZKhjEqY+WxJ/gqdlNOTZlA\nljSMuXympGFse1u6di3ZfmTBXojAvxEAaI5KhjEqY+WpWjelN4xJ6cZr5VEZ8y4vkr0QgX8jANAc\nlQ1jfOsvR5W7KaV0YcylMra0JF2/HrxPVQpjVMYAoHkqGcb27+dbf1mKrIzFhTF766OiKmNScNvr\n6ybI7dsXv26e7LQjVMYAoFkqGcaojJWniMqYazelvyJlf89rzJj3Pf374d+HsHXztLVlAhmVMQBo\nlqLD2EGXlaiMladKlbGsw9gklbEqhDFvqKQyBgDN4RrGliU9KelIgraXJe34fh5w2ZDKWHmqNGYs\nKowlmVbCdZ4x73v696MKYczb3UplDACaY4/DOg9IeljSUUlfS9D2o5KOe56vSXrOZUMqY+WpQ2XM\njiWzN8yOk0U35Yc+NHw+Py9NT4ffyzIv/srYG28U+/4AgHy4hLFnJV2S9HKCdtuSliR9Kc1O0QVT\nnjqMGbOvuYaxLAbw//APD5+3WmYWfCpjAIAsuHZTthK2+5ikY5IuSDopx7FiFiea8hRRGZu0m9L7\nmguXypi9xVBQtWttbXQf7H4wZgwAkIW8BvBfkPRVSX2ZrsqLchwvJpkTzdaWmeUcxSqyMuYaxmxQ\nktKFMZfK2MyMuc2Qv92tLTPTftXCGF35ANAceYWxpyV9UtKHJd0nqSvptBJcTSlRHStDkZUxl25K\nex9GK6/KmG3b3263O/q+UevmzRsqucgFAJqjiKktzkm6Z/D7MZcN7EmTb/7F6/WkPXvMT16mpswA\neJfKWFAIsq+56vXMOC9vqAsSFLCCukrD1s0blTEAaKYcT7kjLks6LzPdRaBut6sTJ05Ikl5+WZJW\n1OutFLBr8NrczLcqZu3d6xbGlpZGl9nnSStjs7MmkEWJCmNB+1GFyli/H/+5AACT63Q66nQ6kqTV\n1VXJXKyYiaLCmGSmtrgY9mK73X47jH35y9Jf/zXf/MvQ6+U7Xszau9etm9JfkVpYMOEjaWXMJWAu\nLkoXLozvg33Nv66d76yoMOQfwL+zY/4b2jF4AID8rKysaGVlRZIJZleuXOlm1XaRM/Avywzqj8WY\nsfLYKlLeZmbiK2NBVzFOTZlAlqYyFidpN+X2thncXxT/1BbeZQCA+koaxoJqAIdlKl4HPc9PSzrk\nWedRmSkuVl3ehDFj5XGtIk3KtZvSH4Kk5OO1XD9TUNdjVBjzvl4Ef2XMuwwAUF8u3ZRHJH1Ww2kq\nJDMRrLUk
02+6MHi+LhPMXpZ0Vmas2DOSXnXdKXui4Vt/8YqqjKXtppSSh7EklbHr101ItF1/LmHM\nOzt/nqiMAUAzuYSxc4OfMGcl3eF5flnSvZPslD3R8K2/eEUN4I/rprS3PCqyMuYNWO95z/D32Vlp\n377wdYvinXaEyhgANEeRY8acURkrT5ED+KPCWFhFyi7LqzLmfW/7e9g++NfNm3faESpjANAclQxj\nVMbKU5WpLbIMY2kqY979qEoY84ZKKmMA0ByVDGNUxspTlakt4sLY2pqZVsJFnpWxoHtZ5sUbKqmM\nAUBzVDKMURkrT1XGjMWFsZs33Y8P18+UJIzNz5u7CFAZAwBMqpJhjMpYeeoyZsy7ThzXz5QkjLVa\n5gbmRY8ZozIGAM1TyTBGZaw8RY4Zm6Sb0rtOHNduyvbgxhbersegiWe9+0FlDAAwqUqGsb17TeWB\nE03xiqqMuXZTtgPu/JWmMuYSMGdmpLm5YbtbW2aG/SqGMb6wAEBzVDKMtVrmZEMXTLH6/WpdTTk3\nZwKSX16VMdu2bbfbHX2/qHWL4A2VdOUDQHNUMoxJ5mTDt/5i3bplbj5dlaspo0KQXSfOrVvmxzVg\negNWVFepf90iUBkDgGaqbBijMlY8770P8+bSTbm0FPyaXe4ShOwxlKYyZh+j9oMB/ACASVU2jFEZ\nK5733od5c+mmDKtILSyYruwkYSzPypjrfGeT8lbG7Ez8/BsBgPqrbBijMla8Iitjk3RTTk2ZQOYS\nxpJ+Jm+1yyWMbW+bQf5F8F+IMDvLvxEAaILKhjEqY8UrsjIW100ZNaWE5D5eK+/KmHe9vPkvRODf\nCAA0Q6XDGN/6i1V0ZSxtN6XkHsaSfqbFRen6dbNvVQtj/soY1WMAaIbKhrH9+/nWXzT737vsSV9v\n3DD7UlZlTDJtr6+bELdvX/y6ebPTjlAZA4DmqWwYozJWvKRXHk5i714z3mp7e/y1uIqUfS2vypjd\nB5fqnF03b1tbJpBRGQOA5qlsGKMyVrwiK2N2Mteg6phrGPPetijMpJUxlzDmsh+TCgqVVMYAoBkq\nG8aojBWv6MqYFDxuLEllLG5aiaZUxoJCJZUxAGiGyoYxKmPFK3rMmDRZZezmzfhjJM/K2Py8ND1d\nTBijMgYAzVXZMEZlrHhFVsZsN+UklTHvumHyrIy1WuZG5mWFMb6wAEAzVDaMcaIpXhmVsaqFsXbb\nPK6txc91ZvejrG5KvrAAQDNUNozNzpourKCr7ZCPMsaMRXVT2mAUxDWMJe2mnJmR5uak1183M+tX\nJYxRGQNMlKbYAAAduElEQVSA5qpsGONGyMUr42rKsMrY3NxwnSB5VcZs25cvj75P1LpUxgAAk6hs\nGLMnT042xen1hjegzltcN6VLCLLrRun1zNiuqGAX1PalS6PvE7UulTEAwCQqG8ZsBYCTTXH8M7zn\nKa6bcmkpenv7uktlbHbWBDJXS0vDypjLfpRdGYub3gMAUG2VDWM2FBDGiuO/92Ge4rop4ypSCwsm\nYLlUxpJ+psXFYfhxrYzlHYjCKmM7O+G3lQIA1ENlwxhjxopXRmUsbRibmjKBzLUyloT3vV3C2Pa2\nGeyfp7DKmPc1AEA9VTaMURkrXpGVsbhuyrgQJLmN10pbGQv6PWrdvLsqwypj3tcAAPVU+TDGt/7i\nFFkZm7SbUnILY0VUxqT8wxiVMQBoLtcwtizpSUlHErR9TNIpSY8MHpNsy7f+EpRRGfOHsRs3pOvX\n3cNY3E26J6mMzc5K+/a5rZv3zcKDph2hegwAzeAyicERSb8q6QFJX3Ns95ikpyTZa9EWJF2W9KCk\ncy4N8K2/eFUYM+Yy+761uCh961vR60xSGXPdB6mYyph/2hHGVQJAM7hUxs5J+nzCdp+WqaRZGzLV\nsSdcG6AyVrzNzfLHjCUNY3mOGatSGAsKlVTGAKAZXLspE8zSpKMylbAXfc
vPSzos6YBLI1TGitfr\nlT9mLE0Yi5pWoimVsaDPwRcWAGiGPAbwHx48XvItv+R7PRInmuKVURmbNIzdvBl9jORdGZufl6an\ni+mm9H8OvrAAQDPkEcbuHjx2fcvtEOdll0Y40RSvyMpYVt2U3m2C5F0Za7XMDc2pjAEA0srjLoT+\nEJZKUSeaL35R+uhHpfvuy7bdv/s76Z/+Sfqt38q23TfekB56KHiS0Z/5Gek3fiN920VWxmw35R//\nsfRXfzVc/s1vmsekYez97w9eJ00Ya7fd98GuR2UMAJBWHmHszcFj27fcXlnp776UJHW7XZ04ceLt\n5x//+IparZXcTzSf/7z0sz+bfRj78pelZ57JPox94xvSX/6l9EM/JM3NDZe/9pq5n2LaMNbvS1ev\nmm63IszNSb/0S9LFi2YqC2tpSfrUp6Q77ohvw6UylqabcmZG+sxnpE98wm39IsIYlTEAKFen01Gn\n05Ekra6uSuM5J7U8wtgZSY9rGL4s2z15Pmijdrs9EsYkc7LJ80SzsyNtbORzIl1fNyHj5s1hl1xW\n7UrS6dPSRz4yXP5rv2bCX1qbm2ZfXatBk2q1pP/7fydrI69uSslUTJPsB5UxAGi2lZUVraysSDLB\n7MqVK5n0BEr5jBl7Raar8n7f8nskXZS06trQ7Gy+J5qNDVMRyiuMeR+zbnfJF3UXF6Vu1wTMSdot\nKoxlIS6M3bplfvLueqUyBgCYRNIwFjTFxVGZkHXIs+whmYlfrbbMhK8PJ3mzvCtjeQWmPNu27bV9\nxdHFRRPErl6drN06hTEbSMP+G9sgn/dFCUtL5VTGmPQVAJrBdQb+hyX1JX1Spurln0X/gCTvafxZ\nmW7KUzLzjX1M0qclvZBk5/KujNU1jM3Pj87ELo1WiRYW0rXrbacOFhZMd2dcGCuqMtbvm/3JQ1Bl\nzM7IT2UMAOrNJYydU/QtjM5Kmg5Y/vTgJ7XZWSpjQe0GBSZvGDtwIF273nbqYGrKBLKw/8b22Mm7\nMra4KG1vmytc87oAIuxChLy/sAAA8pfHmLHMFNVNeeNGtu+ztTXsLiwjjKVt19tOXUSN1yqyMibl\n21UZdiFC3v9GAAD5q3QYK6qbUpLW1sLXS6rrub4iy3Zte4SxocXF8P/GRVbGpHzDGJUxAGiuSoex\nvL/1e0/iWZ5IvW0VXRlLG/7W1814pzTjzcpUpcpY1sHb6vepjAFAk1U6jBVZGWtKGEv7fmtrJohN\nVfqIGBcVxppSGbt50wSysMoYYQwA6q3Sp96ixoz5f69qu7a9oDA2NzfZDavD2q26KlXG8gpjUVN0\n0E0JAPVX6TBWRGVs377h71m2K5m2s2zXXmgQFJparckmH617GOv3x19rSmUs6nPQTQkA9VfpMFZE\nZWx5efh7lu1Kpu082g0LTbs1jN28GXycFFUZm5+frCoZJ+pzUBkDgPqrdBgrojJ2113RE4embVeS\nDh4kjOUtqipVVGWs1TJ3RKAyBgBIo9JhrIjK2B13RE8cmrbd2VnpzjsJY3mLCmNFVcbsflAZAwCk\nUekwNjtrJlDd3s6nfRtAsj6R5tmuFB6a0t4j0d4svWlhrKjKmN0PKmMAgDQqHcbyvBHyzo60sZF/\nGLt+3YxpyqpdKfvK2Oam2Ud74+062Q1hjMoYADRbpcOYPYnmcbLZ2DAVobzDmH2eVbtSdBjrdk3Q\nzLLdKovrpmy1pJmZYvaDyhgAII1KhzFbCcjjZOMNIE0KYzs7w/tiZtVulcVVxmZnTSArYj/KrIwF\nTe0BAKiHSoexPCtjdQ1j8/PSnj3Br6d9vzqHsYWF8Kthw+7nmAc7Xi+PUBRXGdvZMWMrAQD1VIsw\nVlRlLKsTaZ5hLCow7cYwNj0dfjVs2P0c87C4aC40uXYt+7bjKmPedQAA9VPpMJbnAH5/GLOz209q\na8t0E3rDWFY3kF5bI4wFCatsFlkZy3
MW/rjKmHcdAED9VDqMFVkZ8y6bRLebT7u2HcLYuMXF4MBb\ndGVMyjeMRVXGCGMAUF+VDmNFDuD3Lsuq3XY7u3ZtO3mEsbU1M+5qYSH9vpWp6ZWxqBuF51k9BgAU\no9JhLO8B/DMz0m235RfGZmakubnqh7H1dRPEpip9NIQLC2NNqozNzJjxcX5UxgCg/ip9+s2zMra2\nZq6Aa7WGk51mGcZsm1leqRkXxubmzAk76Ri1us6+b1WpMpbV+ECvqM9BZQwA6q/SYSzvypg9geZV\nGbOPWbRrLzCICk2tVrr3a0oY818N26TKWNjnoDIGAPVX6TCW95ixOoUx10H2uzWM3bw5fpwUWRmb\nnzdVybzGjIV9Dqa2AID6q3QYK6oyFjVxaJp2JcJYkcLCdJGVsVbLXLBRdGWMqS0AoP4qHcaKqoxN\nTYVPHJqm3dlZad8+89zOzJ5Fu1J8aErzfk0NY0VWxux+UBkDACRV6TC2d6+pOORdGZOyrWDl1a5t\nL0rS9+v3mxvGiqyM2f2gMgYASKrSYazVMiebrE80OzvSxkZxYez6dTOmadJ2bXtRkn6OzU2zb00M\nY71eM8IYlTEAaLZKhzHJnGyyPtFsbJiKUFFhzC6ftF1ve2EWF81dAHZ2sm23yoL+G29vm1tTNaGb\nksoYADRbGWHsYJKV86iMBQWQuoQxO6t/mMVFE8SuXk3Wrp0XrY6C/htH3c8xz/0oujLGPGMAUH+u\nYeyYpFOSHpF0UtIhx+2WJe34fh5IsoN5VMbqGsbm5sxM7FGSvl8TKmP2Nk7ez2yPmTIqY/75ziYV\nVRnbs8f8UBkDgPra47DOMUlPSTog6S2ZytbLko5IeiVm20clHfc8X5P0XJIdnJ0trjK2tmZOpK1W\n+rbX1oLD2KQzs9s7BsTxhrEDB+LXb0IYm54evxq2rMrY9rZ07ZqZdywrm5vRoTKPfyMAgOK4hLEn\nJH1FJohJ0mVJZwfL74/Yri1pSdKXJtnB/fuLq4zZiUNvuy1du1tb5kScV2XMJTAlfT8bEuscxqTx\nKT3KqIx5b6uVZRiLuxAhj38jAIDixHVTHpaphJ3xLT8r6ahMtSzMYzJVtQsyXZuJxopZRVbGvK+l\n0e3m067dPo8w1oTKmDSsbFplVcak7MeNURkDgGaLC2NHB4+XfMvtaW85YtsLkr4qqS/TVXlRCceL\nScVWxryvZdWuHXBf5TDWag3HXdWVf8xfWWPGpGzDWL9PZQwAmi4ujN0xeOz6ltvnUWHsaUmflPRh\nSfcNtjmthBWyvCpjMzOj3ZF5hbGZGTPwvsphbGHB3IWgzvxhrCmVsZs3TSCL+hxUxgCg3uJOwW/G\nvB4z0cLbzkm6Z/D7McdtJOU3tcXS0uhAfe94n0na9bZlZXGlpmsYm5tLdsPqus++bzW1MubyOaiM\nAUC9xYUx2z3pD11t3+suLks6r+hq2pg8prbwX/Eo5VcZs88naffGDRNIXUJTqzU+fipK08KYnVai\nzMrYpFfOerl8DipjAFBvcVdT2rC1LOlVz3Jb+zmf8P3WZMaOjel2uzpx4sTbz1dWVrSyspJbZaxO\nYSzpIPsk79ekMOa9GraMytj8fLKqpAvXytibcTVsAMBEOp2OOp2OJGl1dVVy7x2MFRfGzssEsh/R\n6Pxg98nMNbaa8P2WZQb1j2m32yNhzMpr0tc77xxdtrBgqkp5hbELF7JvN4x/moe4tt///nT7VSXe\nMH3bbeVUxlotc8FGlmGMyhgAVIMtEkkmmF25csU/nj41l2Hbn5W5GtJeb9eWmfD1s551DstUvA56\nnp/W6Ez9j8pMcbGaZAeLqoxNTY1PHJqm3dlZad++0eVUxvLnr2yWURmz+1F0ZSyPLywAgOK4TPr6\nrMyVkE/IBK6PyQzCf8GzzpJMSLOBbV3DmfrPylTYntFoV6eT2Vkzmer2tukCykJYAMkiNOXVrm3H\nxe
Ki9Npr8ev1+80NY2VUxux+FF0Zy+MLCwCgOC5hTDJXQ56LeP2shtNgSGaw/r1pd8rLeyPkd7xj\n8vZ2dqSNjeLD2PXrZkzT3r3p2rXtuHD9HJubZp+aGMaojAEA6qLys0vZikBWJ5uNDVMRKjqM2dfT\ntuttJ87iorkbwM5Otu1WWVBlrNVKF34n3Q8qYwCAJCofxmxFIKuTTVQASTIlRJCgKTOk4bxjadu2\n27Udr9tYXDRB7OrV6PWaHMbsrPWT3PQ97X6UVRmz03oAAOql8mEs68pYXBiramVsbs7M5u/C9f2a\nFMbs7Zy8lbGiuyil8fnOJuVaGdvZMWMrAQD1U5swVlRlbJITaZ5hLElgcn0/W3Hz3zGgjqanR6+G\n3dwsfvC+ZP7bb29L165l05497uMqY951AQD1Uvkw5h3An4W4MGYnDk1qa8ucgPMKY0kC026sjEmj\nlc1er7zKmJRdV6U97uMqY951AQD1UvkwVnRlzLtOEt1uPu3a7fKojDUtjC0tDat9ZVXGsrjHqZfr\npK/edQEA9VL5MFZ0Zcy7Tlbt2oH3VQxjrdZwvFXdNbkyFnc7JO+6AIB6qXwYy6MyNjNjbpnjl1cY\nm5kxA/CrGMYWFszdB5rAG8bKHDMmZVsZm5mJnvCYyhgA1FvlT8N5VMaWloKnPJiki8luEza2a5Ir\nNZOGsbk5txtWN2X2fauplbG4z0FlDADqrfJhLI/KWFgAyasyZpenaffGDfPZk4SmVsvt/Zoaxvr9\nZlXG4j4HlTEAqLfKh7E8Jn2tUxhLO8h+aWl3hjF7NWxZlbH5ebeqpCsqYwDQfJUPY1lP+ho2S75k\nxk+1Ws0IYy53E2hiGJPM5yqrMtZqmQs2JrmTgxeVMQBovsqHsSIrY1NToxOHJm13dlbaty/49TLC\n2G6sjEnmc5VVGbP7QWUMAOCq8mFs715TbchyAH9UAJkkNOXVrt0+ibj36/ebHcbKqozZ/WDMGADA\nVeXDWKtlvvlncaLZ2ZE2NsoLY9evmwH5Sdu12ycR9zk2N834qqaGsd1UGcu6Kx8AUKzKhzHJnGyy\nONFsbJiKUFxoSjPeJ2osmm1XSn6StvuSJox1uyaABmna7PvS8LO88Ya5PdVuqYxl3ZUPAChWLcJY\nVpUxlwCSZ2XMuw9J2pWGs/i7Wlw0Qezq1eDX04a8KrOf5TvfMY9NCWOuY8YIYwBQT7UIY1lVxuoa\nxubmzCzsScS9XxMrY/a2TjaMld1N2e9P3lavFx8q9+wxP3RTAkA91SaMFV0ZS3oizTOMpQlMuzGM\nTU+bQPbtb5vnZVbGtrela9cmb8v1QoSs/o0AAIpXizC2f3+xlTE7cairrS1z4o1qN+2tlghjySwu\nDsNYmZUxKZuuStcLEbL6NwIAKF4twljRlTHvui663XzatevnGcbC7qVZV94wVmZlTMomjFEZA4Dm\nq0UYK7oy5l03q3btAPw0YSxNYHIJY63WcJxVUywtlT9mbJIbznv1+1TGAGA3qEUYy7IyNjMj3XZb\n+Dp5hbGZGTMQv0qVsYUFc9eBJllclG7dMr/XvTJ286YJZFTGAKDZanEqzrIytrRkKkJh0lQ1XLv8\n0lypmTaMzc1F37C6abPvW97PVPcxY/aYpzIGAM1WizCWZWUsLoDkVRmzrydp98YN87nThKZWy4TD\n3RzG6l4Zs8c8lTEAaLZahLEsK2N1CmOTXvEY9X67IYyVVRmbn4+uSrqiMgYAu0MtwliRlbGFBVNV\nIozVUxUqY62WuWCDyhgAwEUtwliWt0OKCyBTUyaQJQ1Ns7PSvn3R6xHG8leFypjdDypjAAAXtQhj\ns7NmYtXt7cnaibuZt5UmNOXVrt0ujbCbnvf7uyOMlVUZs/uR5obzXlTGAGB3cA1jxySdkvTI4PFI\nztuNsJWBSb757+xIGxvuoSnJiTRJyLt+3QzMd23XbpdGWPi7ft1M
m9D0MFb3ypgNV66VMcIYANTT\nHod1jkl6SpKduGFB0mVJD0o6l8N2Y2xloNeT3vGOJFsObWyYilDZlTG7/p13urXr3S6pxUVzd4Cd\nndH5xJp6KyRp+JlaLWnv3nL34/LlydqwXz5cK2N0UwJAPblUxp6W9KTn+YZMleuJnLYbYysDk3zz\nTxJAighjru1Kw9n7k1pcNEHs6tXgdpscxmZno+eTK2I/iqyM2W7KpDe4BwCULy6MHZWpaL3oW35e\n0mFJBzLeLpC3MpZWXcPY3JyZvT+NsPdrchizt3cqs4tSGh5Dk4SjJJWx/fvNe21tpX8/AEA54sLY\n4cHjJd/yS77XJ96ua++2HcCejIqujLmeSPMMY5MEpiaHsU6nE7h8etoEsjIH70vmv+32tnTtWvo2\nkg7g926DobBjBQjC8YIEUvZbjYsLY3cPHv1JyQ5vX85qu6gwlsUA/qRh7OZNtxPb1pY54RLGihX1\nB3NxsRqVMWmyrsqkU1t4t8EQJ1ckwfGCBDILY3ED+MMTUj7bBbLf+jc20p9sXn/dPCYJTd/9rvS+\n942+9vWvd/STP7ny9vM33kje7ve+F/057Hu4XqUZpNPpaHFxJfD9ov5bdDodraysOL9H3Lpx60S9\nnmRfvBYXy++us/9t/+ZvOvqVX1lJ1YYd6xdVGbP/jew63e6wqzYJ/3Gd9TYu6066Tthrt27VI6Sm\n+X9QxnukbaPo4yXNsSJxvGTdfhP/tsTNKZpG3BDnRyU9LtOt+Kpn+VFJz8tcMflcBtt19u3b9/Ef\n+7EfC9yJq1el8+dj9tTRT/zE6JWFQV5/Xfr3fw97dVVBQ96+//uld787ut1+X/r61+P30fse73yn\n9IM/6LKNr4XVVb33vQf0z/8cvs5P/uT4IPfV1VUdOHDA+T3i1o1bJ+r1tK/927+ZP6iHwzrRC9Dt\nSv/6r1LY8ZJE0P8ny/53iD5mXawq+X4m2cZl3UnXCXvNpd0qWFX++5nFe6RtI8l2LuvGrRP1etrX\nqmRV+e5nVu2naSfJNi7rTrrO6Gs/+qPSf/zHq9rY2Lji0LBkxtBvRK0QF8YOSXpZJkS94Fl+XNJJ\nme7G1Yy2u6DRSl1XGVfYMtJW/vuVxXukbSPJdi7rxq0T9Xra16ok7/3Mqv007RR9rMStE/Yax0q2\n78Hflmrgb4v7uln9bWlrtGvyDwc/QdZlQphkClS/G/P+sdZkqlxeT0p6LaftwhxPuR12j2MyoX5N\nJvQDUY7JVOo5XuCqLfM3BohyXNJHs270AY0efG2ZP17/zbPsqKSLMhWxJNu5WJCZm2wn4XbYXQ5r\nGP4PyRwvD5W3O6i4gxreEWRB5m9T5n880ThPSnqz7J1A5V2Q+aKX+TnoIY3e1ui/+14/Kmlb40Er\nbjtXB0UYQzT/rbZOarwyC4R5SdLtZe8EKu0Bmb8zE951FrvAAzLnoB1VqJK6LPNtIuy+lN77V57U\naHXN2wZhrPmyOFas55U+/KMekhwvUffGPT5YB8016bGyLFNcOCjC2G6Q1d+WBZkwVvrflwdkToo7\nCj4xHpM5sO03Unug+0+yhLHmy+pYkUyX+Ndy2EdUxxFJpxV/vFi2K9L/R9N73IX9QUW9ZXGs2Cr7\nsghjTZfV3xbL9hCWzo7fCfpQFyX9iW/ZKZk/jl6Esd0hi2NFMlUzupyaL+p4WZf0Bd+ykzLdkUEe\nV0X+YCIXkxwrx2W+/EmEsd0iy78tRzV+7hrjcqPwSYVNn3FY5gA/41t+VmbnD+S4T6imLI4V21f/\nVtY7h8oJO17S3Bv3bHa7hQpKe6wclAljZzQclN2WmRWACz6aa5Lj5aBG7zJ0TOYixEhxM/Dn6ejg\n0X//Su8tk1YL2xtUmeuxckymgmYnGj48eB452R4ax+XeuKsanVfokJjeYjeKO1YOSbrXs3xB0mVJ\nH855v1BNLn9blmTGmz0lM9/q
STlkmTLD2B2DR/8ka/b5ssyEsQuSPiupL1MyDJrxH83mcqy0NN7N\n9LKkj+W4X6gml3vj2nEhp2SqHuc1OkE1doek91FuyZyLsDvFHS8HZSZ3fTppw2WGsbi5WuwstxuS\nHh78YHdyOVbOqZhud1Sfy8zd52S+wWJ3SzozfFfDL4fYfXK740GZJy9b1vPf9bztex3gWEESNrz7\njxcbvjheYHGsIIncjpcqhDF/Gdh+qIxuDY4G4FhBEvZCD3/lyx4/HC+wOFaQRG7HS5lh7LzMSfZH\nfMvvkxnrs1r0DqGyOFaQxCsy3Qn3+5bfI3NBx2rRO4TK4lhBErU+Xg7LzNfxQMBrD8gMfLN3Nk97\n/0o0A8cKkog7XrK4Ny6agWMFSTTueDkiMy/LtsyM6EEf7IjMpZ/2tgKV/kDIDccKkjgicwxsS/qK\ngme/zureuKg3jhUkwfECAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACw212U9NHB74/K\n3CUhC48Mfi4MHgGgkfaUvQMAau+gpFcHvy9r9L5taZ2WuS3Wl2RuzLsQvToAAMDuZG+Saz2vbO4Z\nuiPu9wZgl5gqewcA1Nq9ki4Nfm/JVMYuT9hm29MeAAAAAhyXqV6F/bwYse2jkh4f/Dwv6SHPaw9I\nOjVo43mZ8WdHQto5JOlJSS/JdJWeGWx3KmR/H/e0bdc7Mtj2iUE7h2TGwD0/2G558B6PDNb3joez\n739Bphv1tEyV8KXB88ODdtYC9unooK1HBus/HvIZAQAAIp2U9JnB7wcVP17stKQ/8Tw/KBOMHg9Y\n5tJNaQPYZyTdLhPmdgaP1lGNdqWekrQ9WN++38XBOp8ZbPuFwWsXPb9r0LY3PJ72vL9kQtiOTMCy\n+38oYJ92PO+/oOwuegAAALtISyZ02MrVMQ0rSkEOy4SQj/qWnxwst4P0l+Uexp6U9KZv2Y5GA5S9\nGMA6ErAfpwPakUzI8q635mv70UFbXi9Jei1mn3Y0GsAOBrw3gF2CMWMA0liXqS4d1mj34NHB758O\n2ObewWPXt/zs4PGelPsSNLas7fm9L+luz/P1weMljVrTuN+VtKrhNBv+toOsByzr+rZ7WKbr9IJM\nxWzScXYAaowwBiCNxcGPZP6OTEn6qkzAmJKZkiLMsu+5P5xl7cnBe9oK3r0yY8Tectj2uEzIfFLS\nFzPcp6c1DJ+nxZgxYFcjjAFI625J5z3Pl2W66MKcGTze71velqleRW0bpR/z+jmZStTDMuO93pT0\nmEO7yzJdiY9rGNyyusLzkKRXJH2fpN+R6e68PXILAI1FGAOQlndaC8mEF3/Xn9dlSU/JVJu8k7ge\nlwlK/kqVa/AJWs+77LikByV9RaYLsSXTvep3h++5reB9crC/xwaPd2i4//5twvap7VnWlvRbntee\nkgmULpU6AACAtz2u4aD0tswYMhePyAz0t9NbeAfqH9LwasevafQKRD97leS2hlczPiEzZu01DQfe\nH5UZm3VhsL6d2sJW4o572vmCRoOinQrjxcG+Pa7hVZdHBm3a97eBzb9PdpC/3af24PlXZCp1J5XN\nRLkAAACV9LikA75lB2VCEFcxAgAA5MhWxYI8FLIcAArFmDEATdaWGfv1iIZTSyxrOJAfAAAAOXtE\npjq2M3j8isa7LQEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAaJr/DzdcJSXFPysDAAAAAElFTkSu\nQmCC\n", | |
"text": [ | |
"<matplotlib.figure.Figure at 0x113284450>" | |
] | |
} | |
], | |
"prompt_number": 146 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Nice, far fewer dimensions to deal with!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.docvec_rdd.first()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 147, | |
"text": [ | |
"('XFKJ1B||21nov1900||Mortgages & Liens||1900||8||CALDWELL, HENRY C.||THAYER, AMOS MADDEN||SANBORN, WALTER H.||CALDWELL, HENRY C.||CALDWELL||0||0||1',\n", | |
" SparseVector(512, {8: 1.0, 8: 4.0, 10: 1.0, 10: 1.0, 10: 2.0, 11: 1.0, 15: 1.0, 20: 9.0, 24: 3.0, 29: 1.0, 37: 3.0, 41: 6.0, 52: 1.0, 55: 1.0, 61: 3.0, 70: 1.0, 71: 1.0, 78: 1.0, 79: 1.0, 91: 1.0, 92: 2.0, 98: 1.0, 99: 1.0, 99: 2.0, 111: 1.0, 114: 7.0, 115: 10.0, 119: 2.0, 120: 7.0, 127: 2.0, 127: 6.0, 135: 1.0, 144: 2.0, 149: 1.0, 155: 1.0, 157: 2.0, 158: 2.0, 159: 1.0, 162: 3.0, 182: 3.0, 185: 3.0, 186: 1.0, 187: 1.0, 190: 2.0, 196: 1.0, 197: 1.0, 197: 2.0, 201: 1.0, 201: 5.0, 204: 2.0, 206: 1.0, 211: 1.0, 223: 8.0, 225: 1.0, 238: 1.0, 242: 2.0, 243: 1.0, 243: 2.0, 248: 1.0, 250: 2.0, 251: 3.0, 258: 2.0, 271: 1.0, 278: 1.0, 280: 10.0, 286: 1.0, 289: 1.0, 291: 1.0, 303: 3.0, 305: 2.0, 305: 3.0, 306: 2.0, 314: 1.0, 314: 6.0, 315: 2.0, 321: 1.0, 325: 1.0, 327: 2.0, 333: 2.0, 335: 3.0, 336: 1.0, 348: 1.0, 349: 1.0, 349: 7.0, 351: 10.0, 352: 2.0, 358: 1.0, 359: 1.0, 359: 4.0, 360: 1.0, 364: 1.0, 368: 1.0, 371: 2.0, 372: 1.0, 373: 1.0, 374: 1.0, 375: 2.0, 380: 1.0, 380: 4.0, 382: 3.0, 385: 1.0, 388: 3.0, 394: 1.0, 402: 15.0, 409: 2.0, 415: 17.0, 419: 1.0, 422: 1.0, 424: 1.0, 427: 3.0, 428: 1.0, 434: 1.0, 436: 1.0, 457: 1.0, 470: 1.0, 471: 4.0, 475: 8.0, 480: 2.0, 488: 1.0, 494: 1.0, 495: 1.0, 495: 2.0, 499: 1.0, 499: 1.0, 499: 2.0, 504: 1.0, 504: 1.0}))" | |
] | |
} | |
], | |
"prompt_number": 147 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You can see the mapping between these vector indices and the vocabulary: " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To go back to the unfiltered data set, call the ``reset()`` method:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.reset()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 162 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"len(cv.ngram_rdd.first()[1])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 163, | |
"text": [ | |
"1026" | |
] | |
} | |
], | |
"prompt_number": 163 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Output matrix" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Quick and dirty data output:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%bash\n", | |
"mkdir -p temp" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 165 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%cd temp" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"/Users/rokstar/Nbody/projects/bloomberg_ngrams/notebooks/temp\n" | |
] | |
} | |
], | |
"prompt_number": 166 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cv.apply_filter(nmin=5, nmax=10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 173 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"pairs = cv.docvec_rdd.collect()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 195 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import numpy as np\n", | |
"from scipy.sparse import csr_matrix" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 196 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Count the nonzero entries in each document vector and get the total:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# one entry per document; use pairs here, since data is only defined further down\n", | |
"lengths = np.zeros(len(pairs), 'int')\n", | |
"for i, pair in enumerate(pairs):\n", | |
"    lengths[i] = len(pair[1].values)\n", | |
"total = lengths.sum()\n", | |
"print(total)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"43259\n" | |
] | |
} | |
], | |
"prompt_number": 198 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Set up the arrays:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"data = np.empty(total,'int')\n", | |
"indices = np.empty(total,'int')\n", | |
"indptr = np.zeros(len(pairs)+1, 'int')" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 243 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Copy the indices and values from each SparseVector into the flat arrays:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"n = 0\n", | |
"for context, vec in pairs : \n", | |
" l = len(vec.indices)\n", | |
" indices[n:n+l] = vec.indices\n", | |
" data[n:n+l] = vec.values\n", | |
" n += l" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 244 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"indptr[1:] = np.cumsum(lengths)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 245 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Set up the sparse matrix in CSR format:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"mat = csr_matrix((data,indices,indptr), shape = (len(pairs),vec.size))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 247 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Save the data to a file:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"np.savez('docvec_data.npz', data=data, indices=indices, indptr=indptr, shape=mat.shape)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 248 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Load the data and rebuild the sparse matrix. The shape is passed explicitly because otherwise the number of columns is inferred from the largest stored index:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"loaded = np.load('docvec_data.npz')\n", | |
"mat2 = csr_matrix((loaded['data'], loaded['indices'], loaded['indptr']), shape=tuple(loaded['shape']))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 257 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Shutdown Spark" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Before you exit your session, shut everything down to avoid leaving zombie jobs running on the cluster. The ``if False`` guard prevents an accidental shutdown; change it to ``if True`` when you are ready to quit." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"if False: \n", | |
" os.system('%s/sbin/stop-all.sh'%spark_home)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 164 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 164 | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
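
The "Output matrix" cells above flatten each document's sparse vector into three arrays (`data`, `indices`, `indptr`). The sketch below shows that CSR layout in plain Python, with no Spark or NumPy needed. The helper names `build_csr` and `csr_row`, and the use of `{column: count}` dicts in place of SparseVector objects, are assumptions made for illustration and are not part of sparkgrams.

```python
def build_csr(rows, n_cols):
    """Flatten per-document {column: count} dicts into CSR arrays.

    `rows` is a hypothetical plain-Python stand-in for the indices and
    values of each SparseVector; it is not the sparkgrams API.
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col in sorted(row):
            if not 0 <= col < n_cols:
                raise ValueError("column %d out of range" % col)
            indices.append(col)
            data.append(row[col])
        # indptr[i + 1] marks where row i ends in data/indices
        indptr.append(len(data))
    return data, indices, indptr


def csr_row(data, indices, indptr, i):
    """Return row i of the CSR arrays as a {column: value} dict."""
    start, end = indptr[i], indptr[i + 1]
    return dict(zip(indices[start:end], data[start:end]))


# three toy documents over a 512-column vocabulary; the second one is empty
rows = [{8: 1, 20: 9}, {}, {3: 2, 10: 4, 511: 1}]
data, indices, indptr = build_csr(rows, n_cols=512)
print(indptr)  # [0, 2, 2, 5]
```

The same three lists can be handed to `scipy.sparse.csr_matrix((data, indices, indptr), shape=(len(rows), n_cols))`. Passing `shape` explicitly matters when the highest vocabulary columns happen to be empty, since the column count is otherwise inferred from the largest stored index.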