Imperial News

FoNS Data Science Poster Competition: winners announced!

by Claudia Cannon

FoNS PhD students and early career researchers working in the area of data science submitted their poster entries before Christmas.

The judging panel selected the top two posters. In addition, the ‘popular choice’ award saw more than 170 Imperial students and staff vote for their favourite poster.

Data Science theme leads, Professors Guy Nason and Sophia Yaliraki, said: "Thank you to all those who submitted entries for the competition - it’s great that you’ve shared your research with the Faculty in this way. The level of posters was extremely high and our panel of judges agreed that the final decision was not easy. Congratulations for the excellent science presented."

Congratulations to the winners!

Meet the participants

We caught up with some of the participants to learn more about their work, and to find out why they've chosen to focus their research in the area of data science.

Léonie Stroemich & Florian Song (PhD students, Chemistry)

Winners of the poster competition

ProteinLens: a user-friendly web-based application to uncover allosteric signalling in structural data

ProteinLens is a user-friendly web application which allows the user to analyse structural biological data in an interactive manner. Through this application our research group’s methods are made available to the wider community at the click of a few buttons, without requiring specialist knowledge or skills. Our graph theoretical models aim to shine a light on allosteric signalling, a process through which a molecule binding at one particular site of the protein can have an effect on a different site. This intra-molecular signalling can be utilised in a wide range of applications, such as drug discovery. By providing a computationally efficient and easy-to-use way of detecting and predicting allosteric sites and pathways, ProteinLens can guide experimental research and help users to explore their own protein project with a state-of-the-art approach.

Why do you find data science an exciting area of research?

Our group is made up of people from a wide range of scientific backgrounds, from physics and chemistry, via mathematics, to molecular biology. What brings us all together is the love for data and method development to assess underlying structure and meaning. Our methods focus on the exploration of data with graph theoretical approaches. Those graphs can be constructed from a wide range of data types, and biological molecules are only one field of application. To transfer methods rooted in mathematics (network theory) onto biological matter is the very definition of interdisciplinary science and what excites us the most.

Titus-Stefan Dascalu (PhD student, Physics)

Awarded the runner-up prize

Numerical study of proton beam transport through space-charge lens

Nearly 70% of cancer patients globally do not have access to radiation therapy (RT), as the treatment facilities are located predominantly in high-income countries. Thus, it is important for scientists to develop RT machines that are smaller, cheaper and more flexible in their use.

My research is part of the LhARA (Laser-hybrid Accelerator for Radiobiological Applications) collaboration. LhARA aims to bring a step change in proton and ion radiation therapy by developing and demonstrating the technologies required for radiobiology studies in new regimes. A key component of the facility is a plasma lens. The device ensures the formation of the ion beam from the laser-driven target, avoiding the limit on beam intensity that applies in conventional facilities. To build an efficient plasma lens, ongoing research is focused on understanding the experimental observations obtained with a first prototype and on improving the design of the lens.

What excites you about the field of data science?

Both experimental data and numerical simulations are significant tools to understand the working of a device and to inform decisions for a better design. What is exciting for me is the chance to incorporate both in a research project, so that the physical data and the data from a model together corroborate the theoretical understanding.

Jenna Lawson (PhD student, Life Sciences)

Winner of the Imperial popular choice vote

Silent plantations of Costa Rica - Can forestry plantations support acoustic biodiversity as well as native forests? A big data approach.

Forests are home to 80% of the world's terrestrial biodiversity. It is estimated that we lose around 30 football pitches of forest every minute, most of it in the rich forests of the tropics. This loss is mainly due to agriculture, plantations and urban centres. Many species produce sounds for navigation, feeding and reproduction – more sounds indicate a greater diversity of species. By listening to the sounds of forests and disturbed areas, we can determine the effects of forest loss on wildlife, see what species we lose and even identify tipping points. We recorded in almost 400 locations across the diel cycle for 50,000 hours to look for these changes in species and ecosystem processes. We found a significant loss of wildlife in palm and teak plantations, especially at dawn and dusk: at times when a typical forest is alive with sounds, the plantations were silent.

Why have you chosen to focus your research in the field of data science?

Data science is essential for the expanding field of bioacoustics. We collected 50,000 hours of data, which would take a lifetime to listen to, so if we want to extract meaningful data and make important ecological inferences, then we need to employ data science processes.

Bryan Liu (PhD student, Mathematics)

What is the Value of Experimentation and Measurement?

We look at Experimentation & Measurement (E&M) capabilities, the knowledge and tools frequently used in the tech, marketing and e-commerce industries to experiment with different digital products, services, or experiences, and measure their impact. The capabilities may support the running of online controlled experiments (essentially randomised controlled trials in an online setting) or causal inference (the understanding of cause and effect based on what one observes).

While E&M capabilities are often used to value other business propositions, the contribution of the capabilities themselves to organisational success is often not known. In our research we analyse how the capabilities reduce the level of uncertainty when we estimate the values of other propositions, and hence lead to better prioritisation. We quantify this benefit by calculating the improvement to the value of the prioritised propositions, and provide guidance on how much a capability is worth and when organisations should invest in one.
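
The link between less uncertain estimates and better prioritisation can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the model used in the research: proposition values and estimation noise are assumed to be normally distributed, the noise levels (2.0 and 0.5) are arbitrary, and the function names are hypothetical.

```python
import random
import statistics


def prioritised_value(n_props, n_pick, noise_sd, rng):
    """True value captured when the top n_pick propositions are chosen by noisy estimate."""
    # Assumption: true proposition values follow a standard normal distribution.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_props)]
    # Assumption: each estimate is the true value plus independent normal noise.
    estimates = [v + rng.gauss(0.0, noise_sd) for v in true_values]
    # Prioritise by estimated value, then add up what the chosen ones are truly worth.
    ranked = sorted(range(n_props), key=lambda i: estimates[i], reverse=True)
    return sum(true_values[i] for i in ranked[:n_pick])


def mean_prioritised_value(noise_sd, n_props=50, n_pick=5, n_trials=2000, seed=0):
    """Average captured value over many simulated prioritisation rounds."""
    rng = random.Random(seed)
    return statistics.fmean(
        prioritised_value(n_props, n_pick, noise_sd, rng) for _ in range(n_trials)
    )


if __name__ == "__main__":
    noisy = mean_prioritised_value(noise_sd=2.0)    # hypothetical: weaker capability
    sharper = mean_prioritised_value(noise_sd=0.5)  # hypothetical: stronger capability
    print(f"mean value, noisy estimates:   {noisy:.2f}")
    print(f"mean value, sharper estimates: {sharper:.2f}")
    print(f"illustrative benefit:          {sharper - noisy:.2f}")
```

In this sketch, the gap between the two averages plays the role of the "improvement to the value of the prioritised propositions": the smaller the estimation noise, the more often the genuinely best propositions end up at the top of the list.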

Why do you find data science so interesting?

Since I was young, I have been intrigued by how we make decisions based on our interactions with the world. The advent of the Internet and its related activities has yielded vast amounts of data that allow us to obtain a better understanding of the world, and make less-biased decisions more quickly. My research looks at how we can accurately measure the impact of a given choice and experiment with possible alternatives, which is key to making good decisions.

My research is also motivated by my role outside Imperial - I study part-time and work part-time as a Machine Learning Scientist at ASOS.com. Being in both academia and industry enables me to work on theoretical problems motivated by industry challenges, utilise the (anonymised) data collected via different business processes, and see the practical benefit of the research quickly. This enables us to ask more questions, which accelerates scientific innovation.

Francesco Sanna Passino (Research Assistant, Mathematics)

Mutually exciting point process graphs for computer network modelling

In cyber networks, relationships between entities, such as users interacting with computers, or system libraries and the corresponding processes that use them, can provide key insights into adversary behaviour. Many cyber attack behaviours create new links, initiating previously unobserved relationships between such entities. In the poster, a novel model for point processes on networks is proposed to address two fundamental tasks in network security: network-wide modelling of event times, and anomaly detection in new connections.

Why have you chosen to focus your research in the field of data science?

The role of statistical and data science methods in cyber-security and cyber-defence applications has become increasingly important in recent years. In computer networks, a number of high-volume and high-frequency data sources are available, within which it might be possible to detect the presence of malicious activity in the network. Statistical methods can be used to complement more traditional signature-based methods, which are only able to detect attacks for which a signature has previously been created, in order to identify more subtle attacks. Statistical models have the advantage that they can learn from data, adapt, and capture complex relationships between events.

Find out more about data science across FoNS

The data science theme provides a networking hub, training opportunities and events for researchers across FoNS interested in data science.

View all the submissions for the data science poster competition.

The FoNS research themes aim to get academics together to write strong proposals that address societal issues.