Webscraping Google Scholar & Show Result as Word Cloud Using R


NOTE: Please find the update HERE and HERE!

…When reading Scott Chamberlain’s last post about web-scraping I felt it was time to pick up and complete an idea that I had been brooding over for some time now:

When a scientist sets out on a new project, the first thing to do is to check whether other people have already answered the very questions he is about to work on. For example, I was interested in whether any research has been done on amphibian diversity at regional/geographical scales correlated to environmental/landscape parameters. Usually I would go to Google Scholar, search for something like "intitle:amphibians AND intitle:richness OR intitle:diversity AND environment OR landscape", and then browse through the results. But this is often tedious, and a way to get a quick visual overview would be of great benefit.
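As an illustration of how such a query could be turned into a request URL, here is a minimal sketch in base R. The helper name `build_scholar_url` is hypothetical, and the `num` parameter (number of results per page) and the URL layout are assumptions about how Google Scholar accepts searches; this is not taken from the linked script. The block only builds the URL string and does not fetch anything.

```r
# Hypothetical helper: build a Google Scholar search URL from a query string.
# Assumption: Scholar accepts the query as `q` and a result count as `num`.
build_scholar_url <- function(query, num = 100) {
  base <- "https://scholar.google.com/scholar"
  # reserved = TRUE also percent-encodes characters such as ":" and "&"
  encoded <- utils::URLencode(query, reserved = TRUE)
  paste0(base, "?q=", encoded, "&num=", num)
}

# The example search from the text, unchanged
query <- "intitle:amphibians AND intitle:richness OR intitle:diversity AND environment OR landscape"
url <- build_scholar_url(query)
print(url)
```

The resulting string could then be passed to whatever HTML download and parsing routine is used to extract the result titles.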

The code I present will solve this task. It may be awkward in places and there might be a more effective way to get the same result, but it may serve as a starter, and I would very much appreciate people more literate than me picking up the torch…

For my example search, the result shows that not very much has been going on regarding amphibian diversity correlated to environment and landscape…

See code HERE.

PS: I’d be happy about collaboration / tips / editing, so feel free to contact me and I will add you to the list of editors. You could then edit / comment / add to the script on Google Docs.

…some drawbacks need to be considered:

  • Maximum no. of search results = 100
  • Only titles are considered. Additionally considering abstracts may yield more representative results, but abstracts are truncated in the search result and I don’t know if it is possible to retrieve the full abstracts.
  • Also, long titles may be truncated…
  • A more illustrative result would be achieved if one could get rid of all words other than nouns, verbs and adjectives. I don’t know how to do this, but I am sure it is possible.
  • More drawbacks? You tell me..
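
On the point about keeping only nouns, verbs and adjectives: real part-of-speech filtering needs a tagger, but a crude approximation is to drop a list of common function words (stopwords) before counting. The sketch below does only that, in base R. The titles are invented for illustration, the stopword list is a tiny hand-made one, and `count_title_words` is a hypothetical helper, not part of the linked script.

```r
# Invented example titles (not real search results)
titles <- c(
  "Amphibian richness and landscape structure",
  "Environmental drivers of amphibian diversity",
  "The diversity of amphibian assemblages in a fragmented landscape"
)

# Tiny hand-made stopword list; a real one would be much longer
stopwords <- c("a", "and", "in", "of", "the")

# Hypothetical helper: split titles into lowercase words,
# drop stopwords, and return word counts sorted by frequency.
count_title_words <- function(titles, stopwords) {
  words <- tolower(unlist(strsplit(titles, "[^[:alpha:]]+")))
  words <- words[nchar(words) > 0 & !(words %in% stopwords)]
  sort(table(words), decreasing = TRUE)
}

freq <- count_title_words(titles, stopwords)
print(freq)
```

The names and counts in `freq` are the kind of input a word cloud function expects, for example `wordcloud::wordcloud(names(freq), as.integer(freq))` if the wordcloud package is installed.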

Source: http://thebiobucket.blogspot.com