
Software and Data at Imperial

Working with Data and Software at Imperial College

Resources and information on working with data and software at Imperial

Welcome to our guide on working with data and software for students, researchers and software engineers at Imperial. On these pages we provide a variety of information and pointers to resources across the College website, and beyond, to help you make the most of the available tools and services and work reliably, efficiently and securely with software and data.

Data and information, in a variety of types and formats, underpin all research in some way. Over several decades, software has also become increasingly important, to the point where it is now present in almost all areas of research. Software provides a way to undertake and automate tasks that would be impractical or even impossible for humans to handle themselves. As technology advances, individual researchers can more easily capture ever-larger quantities of more complex, higher-resolution data. Software is now vital for efficiently processing, analysing and extracting knowledge from this data, supporting the development of higher-quality research outputs.

This material provides a top-level reference to highlight and help you find all the resources, groups and individuals around Imperial that can assist you in working effectively and efficiently with data and software.

Software and Data-related Information and Resources

Data

Storing Research Data

Research data includes not only data generated as outputs from, or inputs to, experiments but also various other forms of data and materials that relate to your research, e.g. slides, questionnaires, protocols, lab notebooks, videos, correspondence etc. With volumes of data growing constantly, it can be challenging to find a storage location that is both secure (in terms of protection against data loss and against unauthorised access) and easily accessible to the individuals and software that need to process the data.

Things to consider:

  • Sensitivity: Is the data sensitive or personally identifiable? Does my chosen storage location provide the required level of security? Are special certifications required and are these present? (e.g. for certain types of medical data)
  • Sustainability: What retention policies apply to data at my chosen storage location? Will my data be removed after some period? Does this matter? Do I need a long-term backup elsewhere?
  • Sharing: Can I share the data with others? Can I easily move my data from its storage location to the location(s) where I want to process it? (e.g. Is the data set too large to transfer to a remote processing location in a reasonable amount of time?)
  • Safety: Do backups of data need to be stored? If so, where? Might I need to access earlier versions of files?
  • Naming: Have I decided on appropriate descriptive and systematic names for my files? (e.g. using appropriate and consistent time and date formats for timestamps used in file names)
  • Funder requirements: Have I reviewed any requirements mandated by my funder regarding data storage, long-term archiving, etc?
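
As a rough illustration of the Sharing and Naming points above, the following Python sketch estimates how long a data set would take to transfer and builds a systematic file name with a consistent timestamp. The function names, the naming pattern (project, description, UTC timestamp, version) and the example figures are all assumptions made for illustration; they are not an Imperial convention, and your group or funder may prescribe something different.

```python
from datetime import datetime, timezone


def build_data_filename(project: str, description: str, when: datetime,
                        version: int, extension: str) -> str:
    """Build a systematic file name (hypothetical pattern, not a standard).

    Pattern: <project>_<description>_<UTC timestamp>_v<NN>.<extension>
    `when` should be a timezone-aware datetime; a naive one is treated
    as local time by astimezone().
    """
    # Convert to UTC and use a compact ISO 8601-style stamp so that
    # names sort chronologically and are unambiguous across machines.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Lower-case and replace spaces so names behave on most file systems.
    parts = [p.strip().lower().replace(" ", "-")
             for p in (project, description)]
    ext = extension.lstrip(".")
    return f"{parts[0]}_{parts[1]}_{stamp}_v{version:02d}.{ext}"


def estimate_transfer_hours(size_gb: float, bandwidth_mbit_s: float) -> float:
    """Rough lower bound on transfer time, ignoring protocol overhead."""
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    return size_megabits / bandwidth_mbit_s / 3600  # seconds -> hours


if __name__ == "__main__":
    recorded = datetime(2024, 3, 5, 14, 15, 0, tzinfo=timezone.utc)
    print(build_data_filename("Lab Study", "sensor readings",
                              recorded, 2, ".csv"))
    print(round(estimate_transfer_hours(500, 100), 1), "hours")
```

With these example inputs the file name is lab-study_sensor-readings_20240305T141500Z_v02.csv, and a 500 GB data set over a sustained 100 Mbit/s link would need about 11 hours. Real transfers are usually slower than this estimate because of protocol overhead and shared bandwidth.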

Data storage options available at Imperial:



Data management plans

A data management plan (DMP) describes how you will manage and look after your data throughout the lifecycle of your research project and beyond. Many funding bodies now expect or require researchers to submit a DMP when applying for funding, or to have one in place before data collection begins. Having a DMP is also good research practice. As well as satisfying funder requirements, a DMP can help you clarify ethical or legal responsibilities, prevent data loss, protect data confidentiality, find and keep track of your data during your project, and prepare your data for archiving and sharing at the end of your project.

Available resources and information:



Research data management resources at Imperial

Imperial's Research Data Management team provide a wealth of online resources on working with, storing, sharing and archiving data, as well as information on policies, budgeting for research data management and writing data management plans. You can also contact them directly by sending an email to rdm-enquiries@imperial.ac.uk.

The following links are available to help you at different lifecycle stages of your project:

Before you start:

During your project:

Finishing your project:

If you have further questions, take a look at the Research Data Management FAQs.
You can also contact the Research Data Management team for further information, training or one-to-one sessions.



Data archiving and sharing

Many funders and an increasing number of journal publishers now expect data that supports published findings to be archived and made widely available with as few restrictions as possible. The easiest way to do this is to deposit your data with a reputable data repository. Depositing with a data repository not only ensures the long-term preservation and accessibility of your data, it also encourages others to reuse and cite your data and enables you to get credit for your data, just as you would for any other published research output.

Things to consider:

  • Which repository is best for your data? Are there repositories that are widely used in your research domain? Does your funder recommend a specific repository?
  • Which data should you keep? Which data does your funder or journal expect you to archive and/or make available?
  • Will there be restrictions on sharing your data (e.g. to protect data confidentiality)? What measures can you take to ensure that any data sharing complies with legal and ethical obligations?
  • How will you promote the discovery and reuse of your data?

Available resources and information:



Working with/storing sensitive data

Working with sensitive data presents a number of challenges around secure storage, access and use. Such data is not uncommon in research work, particularly in areas such as medical research, and various groups at Imperial have extensive experience of working with and managing such data.

  • The Scholarly Communication team provide some general advice on storing sensitive and personal data.
  • Each Department or Division has a local Data Protection officer who should be a first point of contact for all data protection queries.
  • ICT provide some guidance on how to encrypt and protect your data. This guidance is useful but does not focus specifically on large-scale research data, for which you will probably want to explore other options for protecting your data.



Research data-related training opportunities within Imperial

The Research Data Management team run two data management training sessions for PhD students, offered through the Graduate School at various times throughout the academic year:

Also relevant for data management are the Graduate School’s Research Computing & Data Science Skills Courses.

The RDM team can also deliver bespoke training workshops to research staff and students as well as academic support staff. Email rdm-enquiries@imperial.ac.uk for further details.