Blog Archives

AI can’t solve all our problems, but that doesn’t mean it isn’t intelligent

A recent opinion piece I read on Wired called for us to stop labelling today’s narrow machine learning models ‘AI’ because they are not intelligent. I respectfully disagree. AI is not a new concept. The idea that a computer could ‘think’ like a human and one day pass for a human has been around since … Continue reading AI can’t solve all our problems, but that doesn’t mean it isn’t intelligent

Posted in machine learning, PhD, philosophy, Work

Keynote at YDS 2015: Information Discovery, Partridge and Watson

Here is a recording of my recent keynote talk at York Doctoral Symposium on the power of Natural Language Processing through Watson and my academic/PhD topic, Partridge. 0-11 minutes: history of mankind, invention and the acceleration of scientific progress (warming people to the idea that farming out your scientific reading to a computer … Continue reading Keynote at YDS 2015: Information Discovery, Partridge and Watson

Posted in extraction, ibm, information, PhD, retrieval, scientific, watson, Work, yds

SSSplit Improvements

Introduction: As part of my continuing work on Partridge, I’ve been working on improving the sentence splitting capability of SSSplit, the component used to split academic papers from PLOS ONE and PubMed Central into separate sentences. Papers arrive in our system as big blocks of text with the occasional diagram or formula and in order … Continue reading SSSplit Improvements

Posted in demo, improvements, java, PhD, regex, split, sssplit, test, Work

Tidying up XML in one click

When I’m working on Partridge and SAPIENTA, I find myself dealing with a lot of badly formatted XML. I used to manually run xmllint --format against every file before opening it, but that gets annoying very quickly (even if you have it saved in your bash history). So I decided to write a Nemo script … Continue reading Tidying up XML in one click

Posted in PhD, processing, tidy, Work, xml
