Last active
August 29, 2015 14:12
Reply to Rod Page (having technical problems posting this at PeerJ PrePrints)
Thanks for your feedback Rod. I really value it.
I don't pretend to have all the answers. All of the academic content discovery
services are fairly murky about how they actually index things,
as I'm sure you know (Google Scholar perhaps being the most open-ish about how it does things?).
> how comparable are PLoS and Zootaxa from the perspective of search engines?
I am not a search engine. I am a human researcher. Whether a paper is
published in Nature, Science, PLOS ONE or Zootaxa, it is the same to me -
this is a logical and defensible position. I get what you're asking but as
I've never had a job at a search engine I'm afraid I don't have much insight
there.
> you used a complete set of Zootaxa PDFs obtained from the NHM?
Yes, that information is in the paper. Metadata about those PDFs is in the
supplementary materials on figshare. As you know, I cannot easily 'prove' I
had the full set of PDFs because copyright restrictions do not allow me to
repost the entire dataset publicly online. This would infringe the copyright
of Magnolia Press. I can, however, repost the entire set of PLOS ONE articles
analysed, as they were all published under CC BY or CC0.
> articles that are both open access and behind a paywall?
Yes. This is acknowledged in the paper. Regardless of whether a paper is open
access to the general public, it could still be privately indexed by content
search providers, and that private full-text index made available during
search. Discoverability is not access. Paywalls can be made semi-permeable,
allowing known IP addresses through (e.g. Google Scholar's indexing crawlers
and bots), whilst denying access to non-subscribers at other IP addresses.
> Perhaps a better question is how the open access subset of Zootaxa compares to PLoS?
I'm sorry if I didn't make the hypothesis I was testing clear. I want to
test the discoverability of articles (regardless of OA or not). Yes, it does
seem reasonable to presuppose that open access articles might be advantaged,
but until we prove that with data I can't just make that assumption. If you
know of any other research that demonstrates superior discoverability
of OA research (not citations, views, downloads) then please let me know; I
should cite it in this paper.
> confounding different media (PDF versus HTML) with different degrees of access?
I agree. This could certainly be one of the causative mechanisms of the
observed low recall of Zootaxa in Google Scholar. The point is, the observed
effect (poor discoverability in Google Scholar) is real regardless of the
cause [You're welcome to dispute the data given in the tables, but since I
did the searches only a few days ago I doubt the results have changed]. If
the cause is that Zootaxa does not provide HTML, then the obvious solution is
that Zootaxa should provide HTML full-text. Or just accept low
discoverability in Google Scholar :S
> Did you talk to Zhi-Qiang Zhang (editor of Zootaxa)?
Yes. I emailed him this morning.
I'm very pleased Magnolia Press have recently adopted DOIs, are moving to the
OJS platform, and have adopted the CC BY licence for hybrid open access
articles. These are all good moves towards better publishing. Given the
results here, perhaps they should also look at providing full text HTML or
XML, to continue their progress. They are an extremely important publisher of
taxonomy.
> You are making various statements about how you think search engines access
content, it would be interesting to actually know.
I agree, and also feel uncomfortable about the lack of evidence but services
like Scopus, WoK, MAS, MS *are* untransparent, proprietary, opaque systems. I
can't really change that. I certainly see that as a problem. Academia sorely
needs an open, transparent system of indexing peer-reviewed published content.
> ...there is a world of difference...
Yes. I agree there is vast difference in funding between fields. I'm not
entirely sure that difference prevents Magnolia Press from publishing full
text HTML on their OJS platform. Other, similar "shoe-string" (your words not
mine!) operations also produce full text HTML on OJS, albeit not quite at
the scale of Zootaxa & Phytotaxa. But surely this research could be used as
evidence to ask for more funding? Here is objective evidence showing that
more money is needed to do more useful taxonomic publishing to maximize
return on investment. (?)
Prior to this research I was not aware of anything (aside from cited papers
on the OA citation, download and view advantage) that proves with real data that
publishing in PLOS ONE provides excellent discoverability of research (in
Google Scholar), substantially better than at other journals. That's why I've
published this. I think people need to know about this. I think it's
important. Incidentally, this paper doesn't directly test whether
discoverability has anything to do with OA. That needs follow-up work to
demonstrate.
This is merely a first-pass demonstration that born-digital journal content
can have substantially different discoverability in academic search engines,
depending on where it's published (making a conscious effort here not to
overstate what I've done).