Created
March 19, 2015 08:39
{
 "metadata": {
  "name": "",
  "signature": "sha256:e3b545bd63c280510a0335a5a4227bb4ab0a0c8086036bdbc699b17a414c603f"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from nltk.corpus import treebank"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
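    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def is_verb_particle_vp(t):\n",
      "    # Match a VP shaped like verb + object + particle, e.g. 'gave it up'.\n",
      "    return (\n",
      "        t.label() == 'VP'\n",
      "        and len(t) > 2\n",
      "        and len(t[2]) == 1\n",
      "        and t[2].label()[:2] == 'PR'  # label starts with PR, e.g. PRT\n",
      "        and t[0].label()[:2] == 'VB'\n",
      "        # and t[1].label() == 'NP'\n",
      "        # t[1][0] is a plain string when t[1] is a preterminal, so guard the label() call\n",
      "        and not (hasattr(t[1][0], 'label') and t[1][0].label() == '-NONE-')\n",
      "    )\n",
      "\n",
      "def tree_match(s):\n",
      "    # Return every subtree of sentence s that satisfies the predicate.\n",
      "    return list(s.subtrees(filter=is_verb_particle_vp))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 2
    },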
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "matches = ((i, ' '.join(s.leaves())) for i, s in enumerate(treebank.parsed_sents()) if tree_match(s))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 3
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "for i, s in matches:\n",
      "    print(\"[{}] {}\".format(i, s))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "[656] `` The effect will be * to pull Asia together not as a common market but as an integrated production zone , '' says 0 *T*-1 Goldman Sachs 's Mr. Hormats .\n",
        "[675] `` They do n't want Japan to monopolize the region and sew it up , '' says *T*-1 Chong-sik Lee , professor of East Asian politics at the University of Pennsylvania .\n",
        "[746] `` She just never gave it up , '' says *T*-1 Mary Marchand , Mary Beth 's mother ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[933] * Filling out detailed forms about these individuals would tip the IRS off and spark action against the clients , he said 0 *T*-1 ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[1208] In Detroit , a Chrysler Corp. official said 0 the company currently has no rear-seat lap and shoulder belts in its light trucks , but plans *-1 to begin *-2 phasing them in by the end of the 1990 model year ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[1244] `` This is the peak of my wine-making experience , '' Mr. Winiarski declared *T*-1 when he introduced the wine at a dinner in New York *T*-2 , `` and I wanted *-3 to single it out as such . ''\n",
        "[1427] Koito has refused *-1 to grant Mr. Pickens seats on its board , *-1 asserting 0 he is a greenmailer trying * to pressure Koito 's other shareholders into * buying him out at a profit ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[1589] But when the contract reopened *T*-1 , the subsequent flood of sell orders that *T*-211 quickly knocked the contract down to the 30-point limit indicated that the intermediate limit of 20 points was needed *-128 *-128 to help keep stock and stock-index futures prices synchronized ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[1775] `` Ideas are going over borders , and there 's no SDI ideological weapon that *T*-245 can shoot them down , '' he told a group of Americans *T*-1 at the U.S. Embassy on Wednesday ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[1960] But can Mr. Hahn carry it off ?"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[2099] `` The purpose of the bill is * to put the brakes on airline acquisitions that *T*-2 would so load a carrier up with debt that it would impede safety or a carrier 's ability * to compete , '' Rep. John Paul Hammerschmidt , -LRB- R. , Ark . -RRB- said *T*-1 ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[2361] Speculation had it that the company was asking $ 100 million *U* for an operation said * to be losing about $ 20 million *U* a year , but others said 0 Hearst might have virtually given the paper away ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[2608] If * slowing things down could reduce volatility , stone tablets should become the trade ticket of the future ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[2856] Last month , Phoenix voters turned thumbs down on a $ 100 million *U* stadium bond and tax proposition ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[3105] Syndicate officials at lead underwriter Salomon Brothers Inc. said 0 the debentures were snapped by up *-1 pension funds , banks , insurance companies and other institutional investors ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[3421] `` I sense that some people are reluctant *-2 to stick their necks out in any aggressive way until after the figures come out , '' said *T*-1 Richard Eakle , president of Eakle Associates , Fair Haven ,"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "[3751] Programs like Section 8 -LRB- A -RRB- are a little like * leaving gold in the street and then expressing surprise when thieves walk by *-2 to scoop it up *T*-1 ."
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 4
    }
   ],
   "metadata": {}
  }
 ]
}