Speaker 1 – Yuri Aleksandrov, Aridhia
Yuri is a data scientist at Aridhia, where he assists healthcare researchers with managing collaborative research, stratified medicine, and biomedical research through the use of a biomedical informatics and analytics platform. He holds a degree in Criminology from the University of Abertay Dundee and an MSc in Data Engineering from the University of Dundee. Yuri is passionate about using data to solve problems in areas ranging from retail to healthcare. He describes himself as a “multipotentialite”, and his interests span multiple areas, such as entrepreneurship, science, and technology, in particular machine learning and blockchain. He is always open to discussing projects and ideas, so feel free to get in touch at www.linkedin.com
"What can Data Scientists learn from DevOps"
This talk will cover how data scientists can integrate DevOps and agile methodologies into their workflows to improve the management of user/client requirements and the quality and accuracy of the data and outputs their models produce. In recent years, data science projects and models have grown significantly in complexity (e.g. deep learning), have become fully operational, deployable stand-alone products, and often involve collaboration across large teams. This closely resembles the traditional software development process, as opposed to "load your data and do some stats stuff". However, many data scientists (myself included) come from non-software-engineering backgrounds, and the caveats of working with code at scale are often learned on the go. Here, we will discuss how introducing DevOps-like methodologies helps streamline projects, enables better collaboration, and lets businesses gain more from data science.
Speaker 2 – Mike Chantler, Heriot-Watt University
Mike is a professor of computer science at Heriot-Watt University. He has worked in various forms of machine learning and visualisation over the last thirty years. He became passionate about helping people make better use of their data and ideas after years of participating in advisory team meetings.
“Topic Modelling: the good, the bad and the ugly”
Topic modelling is an extraordinary public-domain technology capable of generating highly intuitive overviews of, and browsing mechanisms for, large document sets. For instance, it can automatically summarise and categorise hundreds of thousands of unstructured product descriptions, or a large project portfolio drawn from disparate sources.
It does this by automatically generating sets of highly intuitive categories, or ‘topics’, together with a categorisation of the documents into those topics.
For instance, two of the sixty topics generated from a database of 35,000 UK research project descriptions are:
“Virus, Disease, Infection, Vaccine, Parasite, Chicken …”
“Climate, Change, Urban, Risk, City, Resilience, Infrastructure …”
And from the allocation of these topics to projects we can estimate that £169M of UK Research Councils spend was associated with the ‘Climate’ topic, while the equivalent monies for EU projects with UK involvement was £296M.
Note that this type of comparison often becomes extremely expensive, not to mention highly political, if no data-driven mechanism for creating a common categorisation or classification is employed.
Topic modelling has a wide range of uses, from strategic planning to recommender engines.
In this presentation I’ll illustrate the power of topic modelling using a Brexit inspired example, give insight as to how it works, and point out some of the pitfalls and gotchas for the unwary.
There will be no maths or theory.
6:30 PM – 7:00 PM: Networking
7:00 PM – 7:30 PM: Yuri Aleksandrov, Aridhia
7:30 PM – 8:00 PM: Mike Chantler, Heriot-Watt University
8:00 PM – 8:30 PM: Q & A session
8:30 PM – 9:00 PM: Networking and Drinks