The year 2022 has been a big one for the Computational Modelling Systems (CMS) team. We have seen some of our colleagues leave, after many years of working together; Paola Petrelli stepped up as the new CMS team leader; and we welcomed three new members: Ramzi Kutteh, Sam Green and Dale Roberts.
A year has passed that has felt busy and intense; but at the end of it, what have we done? When you are in a support role – as our team is – it is hard to quantify your work. We facilitate other people’s research outcomes, with contributions which are sometimes small and sometimes big.
During this period of transition in team members we still worked on specific projects, whose outcomes are detailed below. However, our biggest outcome was the continuous support we offered the researchers and students at the Australian Research Council (ARC) Centre of Excellence for Climate Extremes.
In the last calendar year we resolved more than 300 requests for help on the National Computational Infrastructure Climate and Weather helpdesk. The helpdesk requests cover a vast range of tasks: someone might be looking for guidance, troubleshooting an analysis code, solving an issue with a model configuration, downloading or publishing a data set, and much more. While some of the answers to these questions might be quick, they all help someone progress with their research project.
We also offer hands-on help via one-on-one meetings and weekly Code Break sessions. Anyone can come to a Code Break session to ask a question or get help with their code. These sessions allow us not only to solve the specific issue the researchers came for, but also to offer them some personalised advice. For the last few months, we have used the Code Break sessions to offer short, focused training at the start of each session. Every week we cover a different topic, and we allow time for questions.
Both the helpdesk and the Code Break sessions give us an important insight into the research work which is happening at the Centre. They are our main contact with the researchers and students and a source of inspiration to decide future goals and longer-term projects. They help us identify areas for improvements in the available infrastructure and documentation. We can react quickly by offering new or improved code, documenting new model configurations and other processes and providing new data sets and services.
Some of the projects described below, such as the machine learning and the FrontDetection code collaborations, were identified and developed in this way. In particular, the use of machine learning in climate-related projects is growing across our research community and the international community, so the team is also training to support more of these projects in the future.
This year the Australian Community Climate and Earth System Simulator National Research Infrastructure (ACCESS-NRI), a computer modelling framework to support research with the ACCESS model, was established. As the NRI’s modelling partly overlaps with ours, the CMS team has started meeting with them to identify areas of common interest as well as collaboration opportunities. In some cases, as with the Community Atmosphere Biosphere Land Exchange (CABLE) model, we are already working in partnership. Some of the services we maintain are of interest to the NRI and we are reviewing and consolidating them. They include the conda environments, the analytics database and our wiki-based documentation.
In fact, the reorganisation of our services, in view of the ARC Centre of Excellence for Climate Extremes moving towards the end of its funding period, has been a major focus for this year. It is important for us to have a long-term strategy so that the services and documentation we have built over the last decade will survive and, wherever possible, move to a community-based model.
We are trying to achieve this by relocating our documentation to new community- and/or project-based platforms and out of the current CMS wiki, which is UNSW-based.
This is also giving us a chance to review and update the content and to make it more relevant to any researcher in the climate science community. Part of this work is happening in collaboration with other experts from CSIRO and the Bureau of Meteorology, as the documentation aims to build a best practice for the entire climate community.
CABLE Groundwater Module
Dr Mengyuan Mu made some contributions to the CABLE model groundwater module. Dr Ramzi Kutteh was given the task of integrating these into the official version of the model code, as a two-phase project. The first phase involved implementing Dr Mu’s work in the last version of CABLE that preceded a significant recent refactoring of the code.
This is now completed and the code has been successfully compiled and run for a simple test. Currently, more testing and validation is under way and, once this is completed, the second phase will involve the final integration into the latest refactored version of CABLE.
Alongside this task, Dr Kutteh is also participating in the CABLE model documentation project. This is led by the ACCESS-NRI CABLE team and it is an important and significant effort to improve the usability and provenance of this model.
Australian Community Reference Climate Data Collection @ the National Computational Infrastructure
In July 2022 the new Australian Community Reference Climate Data Collection @ the National Computational Infrastructure (NCI) was launched in collaboration with the Australian Climate Service, to reestablish and maintain a reference climate data set collection at NCI. Currently, the collection contains precipitation data sets, climate indices and some sea-surface temperature data sets. The data is hosted in the NCI project ia39 and can be accessed directly or programmatically using a Python Intake catalogue developed by Paola and Sam.
While some of these data sets were already available in one of the ARC Centre of Excellence for Climate Extremes projects, moving them to a longer-term project and sharing their maintenance with others in the climate community is an important step to ensure that the data will be available to the research community past the Centre’s life. It also gave us a chance to review our processes, making them more robust and transparent.
FrontDetection Code
FrontDetection is a Python-based module to detect atmospheric fronts, initially developed by Dr Malcolm King. Various members of the team collaborated with Dr King to improve the code, and this year a new, improved version was released by Sam Green. The refactored code is now faster and more robust, and tests have been added so that future changes that might affect the code’s reliability will be picked up and resolved more easily.
Sam also improved the code documentation and added a tutorial explaining how to run a test case and produce plots. This code is now published on Zenodo and is part of the Centre of Excellence for Climate Extremes code collection.
Optimisation of a Machine-Learning Downscaling Climate Model
Sam Green helped Dr Sanaa Hobeichi and her UNSW group to speed up a downscaling climate model using a GPU-enabled machine learning algorithm. Dr Hobeichi developed the original code, but this was using only one GPU, so she needed some help to parallelise it.
Sam hadn’t used machine learning himself before this task, so he learned on the job through an online course and documentation. He also attended an NVIDIA-organised hackathon with Dr Hobeichi. Sam improved the way the code loads its data, so the data is now loaded only on GPUs, which was necessary to scale the code effectively. At the hackathon, Sam and Dr Hobeichi learned how to profile the code and identify other bottlenecks, which led them to refactor the code to remove repetition and to run four processes per GPU instead of only one. The next step will be to make the code run on multiple GPUs. The pair will also collaborate to produce a climate change machine learning tutorial for the NCI machine learning course series in 2023.
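As a rough illustration of running several worker processes per GPU, the sketch below maps worker ranks to GPU indices, with four workers sharing each device. It is not the code developed for this project: the function names `gpu_for_rank` and `worker_env` are hypothetical, and the only real interface it relies on is the standard `CUDA_VISIBLE_DEVICES` environment variable, which limits the GPUs a CUDA process can see.

```python
import os


def gpu_for_rank(rank: int, procs_per_gpu: int = 4) -> int:
    """Return the GPU index that worker process `rank` should use."""
    if rank < 0 or procs_per_gpu < 1:
        raise ValueError("rank must be >= 0 and procs_per_gpu >= 1")
    # Consecutive ranks share a device: ranks 0-3 -> GPU 0, 4-7 -> GPU 1, ...
    return rank // procs_per_gpu


def worker_env(rank: int, procs_per_gpu: int = 4) -> dict:
    """Build an environment for one worker, pinning it to a single GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_for_rank(rank, procs_per_gpu))
    return env


if __name__ == "__main__":
    # Eight hypothetical workers spread over two GPUs, four per GPU.
    for rank in range(8):
        print(rank, worker_env(rank)["CUDA_VISIBLE_DEVICES"])
```

An environment built this way could be passed to each worker when it is launched, for example through the `env` argument of `subprocess.Popen`. The same mapping also extends naturally to the multi-GPU case.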
New Improved Documentation Resources
Paola Petrelli has been working on two collaborative projects to create new resources for the Australian climate community, both built using JupyterBook. This Python-based software allows anyone to build online books from a simple GitHub repository. It is easy to contribute and there is no need to host a website.
The first of these projects, a handbook titled Climate Dataset Guidelines, provides guidelines for creating, managing, sharing and publishing climate data. With a focus on Australian climate research, this handbook has been designed to enable a common approach across the community. The handbook also includes a lot of practical tips on technical aspects of data management. The second project, a report titled Working with Big and Challenging Data Collections, focuses on solving challenges posed by analysing climate data which is ever-increasing in size and complexity. It covers topics from the basic knowledge necessary to understand how the data is structured and how this affects analysis, to useful analysis techniques and software, including a review of all platforms and tools available in an Australian context.
A One-stop Catalogue to Discover Climate Data in Australia
In 2021 Paola Petrelli co-hosted a climate-data-related workshop at the Australian Meteorological and Oceanographic Society conference. One of the main findings was that the climate science community struggled to find data sets and other resources, as each organisation involved in the science uses different portals to share its data, and a lot of data (such as replicated data sets) is not even listed anywhere. A collaborative working group was then established to explore solutions to this problem.
This is how the Australian Climate Data Guide Catalogue was born. The portal is open to anyone looking for climate data sets and other resources (such as software, online resources and training) available in Australia. The portal lists information on the resources, with links to their official documentation where it exists. The portal is based on the Invenio Research Data Management package, but Paola configured it and adapted it to introduce climate-related search capabilities. We are currently seeking feedback before finalising the portal for production. While the portal’s main aim is to list resources for climate researchers, it has the potential to be expanded to also include resources useful to stakeholders.