# N.B. On *ubuntu, RCurl may not install right off the bat. If so, read http://www.omegahat.org/RCurl/FAQ.html
# and run: sudo apt-get install libcurl4-openssl-dev
install.packages(c("RCurl","twitteR","wordcloud","tm","stringr"))
library(twitteR); library(wordcloud); library(tm); library(stringr)
# Search for #mooc tweets
mooctweets <- searchTwitter("#mooc", n=2000)
length(mooctweets) # ends up with 713 as of 03-Jan-13 at 15:42 London time
# make into a data.frame
mooctweets_df <- twListToDF(mooctweets)
Journal Name ISSN
Abstract and Applied Analysis 10853375
Acta Crystallographica Section E 16005368
Acta Electrotechnica et Informatica 13358243
Acta Linguistica Asiatica 22323317
Acta Medica Martiniana 13358421
Acta Societatis Botanicorum Poloniae 16977
Acta Universitaria 1886266
Acta Universitatis Palackianae Olomucensis : Gymnica 12121185
Acta Veterinaria Scandinavica 17510147
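
Two of the ISSNs in this list (16977 and 1886266) have fewer than eight digits, which suggests that leading zeros were dropped somewhere upstream. A minimal Python sketch, assuming that explanation holds: left-pad each value to eight characters and confirm it against the standard ISSN check digit. The function names are hypothetical and not part of the original file.

```python
def normalize_issn(raw: str) -> str:
    """Left-pad a bare ISSN to 8 characters and format it as NNNN-NNNC."""
    chars = raw.strip().replace("-", "").upper().zfill(8)
    if len(chars) != 8:
        raise ValueError(f"not an ISSN: {raw!r}")
    return f"{chars[:4]}-{chars[4:]}"


def issn_is_valid(issn: str) -> bool:
    """Verify the ISSN check digit (weights 8..2, modulo 11, 'X' stands for 10)."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    total = sum(int(c) * w for c, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


# Assumption: short values only lost leading zeros; nothing else was altered.
for raw in ["10853375", "16977", "1886266"]:
    issn = normalize_issn(raw)
    print(issn, issn_is_valid(issn))
```

If a padded value fails the check, the row probably needs manual inspection rather than automatic repair.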
egrep "(^Citations$|Cited Literature$|Literature [cC]ited$|Literatures cited$|Literature Cited\:$|References$|^references$|Refrences$|References [cC]ited$|REFERENCES$|Bibliography$|BIBLIOGRAPHY$|LITERATURE CITED$|LITERATURE cited$|REFERENCES CITED$|References \[not in Zootaxa format\]$|^Reference$|^Literature$|^References \(asterisks|^References \(except original descriptions|Litterature cited$|Literture Cited$)"
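
The pattern enumerates spelling and capitalisation variants of a reference-section heading, misspellings included. The same idea can be sketched in Python, assuming the input is plain text already split into lines. This is not an exact port: it covers only a subset of the alternatives, anchors at both ends of the line, and ignores case, so it will accept and reject slightly different lines than the egrep version. `find_reference_heading` is a hypothetical helper name.

```python
import re

# Subset of the heading variants from the egrep pattern, misspellings included.
# Assumption: a heading occupies a whole line, so both ends are anchored.
HEADING = re.compile(
    r"^\s*(citations|cited literature|literatures? cited:?|references(?: cited)?|"
    r"reference|refrences|bibliography|literature|litterature cited|literture cited)\s*$",
    re.IGNORECASE,
)


def find_reference_heading(lines):
    """Return the index of the first line that looks like a references heading, or None."""
    for i, line in enumerate(lines):
        if HEADING.match(line):
            return i
    return None


text = ["Introduction", "Some text citing References in passing.", "Literature Cited", "Smith 2001"]
print(find_reference_heading(text))
```

Anchoring at both ends keeps a sentence that merely mentions "References" from being mistaken for the heading.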
library(phangorn)
#264 REFERENCE trees in phylip format, PAUP numbering hence 2
ref2 <- read.tree("jackr2.tre")
#264 trees in phylip format to pair-wise compare to the reference trees, TNT numbering hence 1
tr2 <- read.tree("jack1.tre")
x <- {}
#all reference trees to one comp tree
for (i in 1:length(tr2)) {
<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
<head>
<title>Ross's academic journal RSS feed subscriptions</title>
</head>
<body>
<outline text="General Biology Journals" title="General Biology Journals">
<outline type="rss" text="BioEssays" title="BioEssays" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1002/(ISSN)1521-1878" htmlUrl="http://onlinelibrary.wiley.com/resolve/doi?DOI=10.1002%2F%28ISSN%291521-1878"/>
<outline type="rss" text="Biol J Linn Soc" title="Biol J Linn Soc" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1111/(ISSN)1095-8312" htmlUrl="http://onlinelibrary.wiley.com/resolve/doi?DOI=10.1111%2F%28ISSN%291095-8312"/>
curl -g --location --header 'Accept: application/x-bibtex' "http://dx.doi.org/10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2" > test.txt
RETURNS
<h1>Internal Server Error</h1>
(I've encountered about 91 DOIs that appear to give this error)
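
For checking many DOIs in one run, the same content-negotiation request can be sketched in Python with the standard library. The sketch percent-encodes characters such as `( ) [ ] : ;` in the DOI. Whether that avoids the Internal Server Error is an untested assumption. The function names are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_bibtex_request(doi: str) -> urllib.request.Request:
    """Build a content-negotiation request for a DOI without sending it."""
    # Percent-encode everything except "/" and the always-safe characters.
    url = "http://dx.doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi: str, timeout: float = 30.0):
    """Return the BibTeX text, or None if the server answers with an HTTP error."""
    try:
        # urlopen follows redirects by default, like curl --location.
        with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # For example, the 500 Internal Server Error described above.
        print(f"{doi}: HTTP {err.code}")
        return None
```

Calling `fetch_bibtex("10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2")` in a loop over the problem DOIs would log which ones still fail. Network failures other than HTTP errors are deliberately left uncaught in this sketch.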
I know I'm doing all types of wrong here:
Source HTML file here: http://mdpi.com/1420-3049/19/4/5150/htm
I want the text for the dc.source:
Molecules 2014, Vol. 19, Pages 5150-5162
I am using Beautiful Soup, so it is probably best to do it with that, but it should also be regex-able. I can do this in bash no problem!
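
One way to pull that value out, sketched with Python's standard-library HTML parser so it runs without extra packages. It assumes the page exposes the value as `<meta name="dc.source" content="...">`. The sample markup below is hypothetical and was not copied from the MDPI page. `MetaGrabber` and `dc_source` are made-up names.

```python
from html.parser import HTMLParser


class MetaGrabber(HTMLParser):
    """Collect the content attribute of every <meta name="..."> tag."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            # Keep the first occurrence of each meta name.
            self.meta.setdefault(name.lower(), content)


def dc_source(html: str):
    """Return the dc.source meta content, or None if the tag is absent."""
    parser = MetaGrabber()
    parser.feed(html)
    return parser.meta.get("dc.source")


# Hypothetical markup, assumed to resemble the page's <head>.
sample = (
    '<html><head><meta name="dc.source" '
    'content="Molecules 2014, Vol. 19, Pages 5150-5162" /></head></html>'
)
print(dc_source(sample))
```

With Beautiful Soup the equivalent lookup would be roughly `soup.find("meta", attrs={"name": "dc.source"})["content"]`, under the same assumption about the markup.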
Thanks for your feedback Rod. I really value it.
I don't pretend to have all the answers. All of the academic content discovery
services are fairly murky about how they actually index things,
as I'm sure you know (Google Scholar perhaps being the most open-ish about how it does things?).
> how comparable are PLoS and Zootaxa from the perspective of search engines?
I am not a search engine. I am a human researcher. Whether a paper is
published in Nature, Science, PLOS ONE or Zootaxa, it is the same to me -
img1,Actinokineospora_fastidiosa,Amycolatopsis_alba_DSM_44262,Amycolatopsis_albidoflavus,Amycolatopsis_azurea,Amycolatopsis_balhimycina,Amycolatopsis_benzoatilytica,Amycolatopsis_coloradensis,Amycolatopsis_decaplanina_DSM_44594,Amycolatopsis_echigonensis,Amycolatopsis_kentuckyensis,Amycolatopsis_keratiniphila,Amycolatopsis_keratiniphila_subsp._nogabecina,Amycolatopsis_lexingtonensis,Amycolatopsis_lurida,Amycolatopsis_marina,Amycolatopsis_mediterranei,Amycolatopsis_methanolica_239,Amycolatopsis_nigrescens_CSC17Ta-90,Amycolatopsis_orientalis,Amycolatopsis_palatopharyngis,Amycolatopsis_plumensis,Amycolatopsis_regifaucium,Amycolatopsis_rifamycinica,Amycolatopsis_rubida,Amycolatopsis_saalfeldensis,Amycolatopsis_sacchari,Amycolatopsis_sp.,Amycolatopsis_sulphurea,Amycolatopsis_taiwanensis,Amycolatopsis_thermoflava_N1165,Amycolatopsis_tolypomycina,Amycolatopsis_vancoresmycina,Prauserella_rugosa
img2,Antarctobacter_heliothermus,Donghicola_eburneus,Jannaschia_helgolandensis,Ketogulonicigenium_vulgare,Loktanella_salsila
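
The rows in this file differ in length (the img2 row has only six fields), so tools that expect a rectangular table will reject it. A minimal Python sketch, assuming each row is an identifier followed by a variable-length list of taxon names. `read_groups` is a hypothetical helper.

```python
import csv
import io


def read_groups(text: str) -> dict:
    """Map each row's first field to the list of its remaining fields."""
    groups = {}
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            groups[row[0]] = row[1:]
    return groups


# The second row from the file above, used as sample input.
sample = (
    "img2,Antarctobacter_heliothermus,Donghicola_eburneus,Jannaschia_helgolandensis,"
    "Ketogulonicigenium_vulgare,Loktanella_salsila\n"
)
groups = read_groups(sample)
print(len(groups["img2"]))
```

`csv.reader` does not require a fixed column count, so ragged rows need no padding.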
line_number | msg | _id | _full_text | occurrenceID | catalogNumber | scientificName | scientificNameAuthorship | typeStatus | locality | country | waterBody | expedition | recordedBy | collectionCode | kingdom | phylum | class | order | family | genus | subgenus | specificEpithet | infraspecificEpithet | higherClassification | taxonRank | stateProvince | continent | island | islandGroup | higherGeography | habitat | decimalLongitude | decimalLatitude | geodeticDatum | georeferenceProtocol | maxError | verbatimLongitude | verbatimLatitude | minimumElevationInMeters | maximumElevationInMeters | minimumDepthInMeters | maximumDepthInMeters | recordNumber | individualCount | lifeStage | sex | preparations | identifiedBy | dateIdentified | identificationQualifier | eventTime | day | month | year | earliestEonOrLowestEonothem | latestEonOrHighestEonothem | earliestEraOrLowestErathem | latestEraOrHighestErathem | earliestPeriodOrLowestSystem | latestPeriodOrHighestSystem | earliestEpochOrLowestSeries | latestEpochOrHighestSeries | earliestAgeOrLowestStage | latestAgeOrHighestStage | lowestBiostratigraphicZone | highestBiostratigraphicZone | group |
---|