Monthly Archives: November 2015

ElasticSearch: Turning analysis off and why it’s useful

I have recently been playing with ElasticSearch a lot for my PhD and have started trying some more complicated queries and pattern matching using the DSL syntax. I have an index on my local machine called impact_studies which contains all 6637 REF 2014 impact case studies in JSON format. One of the … Continue reading ElasticSearch: Turning analysis off and why it’s useful

Posted in analysis, elasticsearch, indexing, PhD

FreeCite Python wrapper

I’ve written a simple wrapper around FreeCite, the Brown University citation parser. I’m planning to use the service to pull out author names from references in REF impact studies and try to link them back to investigators listed on RCUK funding applications. The code is here and is MIT licensed. It provides a simple method … Continue reading FreeCite Python wrapper

Posted in citations, freecite, PhD, rcuk, ref, references

Scrolling in ElasticSearch

I know I’m doing a lot of flip-flopping between Solr and ElasticSearch at the moment. I’m trying to figure out the key similarities and differences between them and where one is more suitable than the other. The following is an example of how to map a function f onto an entire set of indexed data in ElasticSearch using … Continue reading Scrolling in ElasticSearch

Posted in elasticsearch, lucene, PhD, results, scan, scroll

Keynote at YDS 2015: Information Discovery, Partridge and Watson

Here is a recording of my recent keynote talk at the York Doctoral Symposium on the power of natural language processing through Watson and my academic/PhD topic, Partridge. 0-11 minutes: history of mankind, invention and the acceleration of scientific progress (warming people to the idea that farming out your scientific reading to a computer … Continue reading Keynote at YDS 2015: Information Discovery, Partridge and Watson

Posted in extraction, ibm, information, PhD, retrieval, scientific, watson, Work, yds

SAPIENTA Web Service and CLI

Hoorah! After a number of weeks I’ve finally managed to get SAPIENTA running inside Docker containers on our EBI cloud instance. You can try it out at The project was previously running via a number of very precarious scripts that had a habit of stopping and not coming back up. Hopefully the new Docker … Continue reading SAPIENTA Web Service and CLI

Posted in docker, PhD, script, web, websockets