Concepts

A number of concepts are helpful to understand prior to using ready4.

1: Model
2: Module
3: Modelling project
4: Reproducible research
5: Transferability

1 - Model

ready4 represents youth mental health systems in computer code.

A model is a simplified representation of a system of interest. In the way we use the term, we also mean that a model is:

abstract and general (i.e. largely free of non-modifiable data, including numeric values, that are assumption- or context-specific) and
a tool (i.e. a model can be used to help undertake an analysis, it is not the analysis itself).

If a model is developed primarily to inform a decision or set of decisions (e.g. relating to youth mental health policy and system design) it can be called a decision model.

Ideally, a model should have three inter-related representations - conceptual, mathematical and computational.

Conceptual Model

A conceptual model refers to underlying theory and beliefs about a system of interest that can be described in words and pictures.

Mathematical Model

A mathematical model formalises a conceptual model as a set of equations.

Computational Model

A computational model implements the conceptual and mathematical models of a system of interest as computer code.

ready4 is a computational model of youth mental health. More specifically, the ready4 computational model is the complete set of ready4 modules.

2 - Module

ready4 is composed of self-contained, reusable components called “modules”.

A modular computational model is one that is constructed from multiple self-contained components, called modules. Each ready4 module describes a data structure and the set of algorithms that can be applied to it.
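As a minimal sketch of the idea that a module pairs a data structure with the algorithms that can be applied to it, consider the Python example below. It is illustrative only: the class names (ClientRecords, ServiceDemand), their fields and their methods are hypothetical and are not part of ready4.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecords:
    """Hypothetical module: a data structure plus the algorithms that apply to it."""

    # Data structure: one age (in years) per client.
    ages: list[int] = field(default_factory=list)

    def count_in_age_range(self, minimum: int, maximum: int) -> int:
        """Algorithm: count clients whose age falls within an inclusive range."""
        return sum(minimum <= age <= maximum for age in self.ages)


@dataclass
class ServiceDemand:
    """A second hypothetical module that re-uses ClientRecords without modifying it."""

    clients: ClientRecords
    sessions_per_client: int = 1

    def total_sessions(self, minimum_age: int, maximum_age: int) -> int:
        """Algorithm: sessions needed by the clients in the chosen age range."""
        eligible = self.clients.count_in_age_range(minimum_age, maximum_age)
        return eligible * self.sessions_per_client


# Made-up values, used only to show the two modules working together.
records = ClientRecords(ages=[14, 17, 21, 25, 30])
demand = ServiceDemand(clients=records, sessions_per_client=6)
print(demand.total_sessions(12, 25))
```

Because ClientRecords is self-contained, it can be checked on its own and then combined with other modules such as ServiceDemand.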

The advantages of developing ready4 as a modular model include:

each module can be independently re-used in other computational models of the people, places, platforms and programs important to the mental health of young people;
a complex model can be developed iteratively, beginning with a simple representation that is easier to validate and then progressively expanding in scope and functionality through the addition of new modules, to be validated independently and jointly; and
long-term and resource-intensive model development is more feasible through combining multiple independently managed and financed modelling projects.

To ensure that all ready4 modules can be safely and flexibly combined, each module is created from a template using authoring tools that support standardisation.

3 - Modelling project

A ready4 modelling project develops a computational model, adds data and runs analyses.

Because ready4 is a complex, collaborative and long-term undertaking, it is not feasible for it to be financed by a single funder or progressed as a single project. Instead, our mode of development is via multiple independent modelling projects, each with its own project governance and funding, but all adopting a common framework.

A ready4 modelling project will involve the three steps of:

Developing and validating a computational model;
Adding context-specific data to that computational model; and
Applying the computational model to the supplied data to undertake analyses.
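The three steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the WaitTimeModel class, its methods and the wait-time values are all hypothetical and are not drawn from any ready4 project.

```python
from statistics import mean


class WaitTimeModel:
    """Step 1: a (hypothetical) computational model that holds no context-specific data."""

    def __init__(self):
        self.wait_days = None  # no data has been supplied yet

    def add_data(self, wait_days):
        """Step 2: add context-specific data to the model."""
        wait_days = list(wait_days)
        if any(days < 0 for days in wait_days):
            raise ValueError("wait times cannot be negative")
        self.wait_days = wait_days
        return self

    def analyse(self):
        """Step 3: apply the model to the supplied data to undertake an analysis."""
        if not self.wait_days:
            raise RuntimeError("add data before running an analysis")
        return {
            "mean_wait_days": mean(self.wait_days),
            "longest_wait_days": max(self.wait_days),
        }


# Made-up wait times for one hypothetical service.
model = WaitTimeModel()
results = model.add_data([10, 20, 30]).analyse()
print(results)
```

Keeping the data out of the model definition is what allows the same model to be supplied with data from a different context later.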

The key components of each step are summarised here.

4 - Reproducible research

Some core concepts relating to reproducible research have multiple conflicting definitions - this is how we use them.

Although the importance of reproducible research is widely supported in the scientific community, the definitions of key terms such as reproducibility and replicability can vary across disciplines and methodologies (e.g. depending on the extent to which computational modelling is used). In some cases, entirely different terms (e.g. repeatability) are preferred. The meanings we intend when using these terms are described below.

Reproduction and Replication

The distinctions we make between reproduction and replication have been guided by the approach outlined in a report by the National Academies of Sciences, Engineering and Medicine. However, we have adapted their definitions slightly as the meanings in that report were framed in terms of study findings / outcomes, whereas our usage relates more to intended objectives when deploying tools.

Meanings

Reproduction

Applying the same analysis code to the same input data with the expectation of generating identical outputs (with the exception of trivial artefacts like datestamps for when analysis reports were produced).

Replication

Applying analysis code used in a study to new input data. The analysis code is reused with only the minimal edits necessary to account for differences in input data paths, variable names and study metadata (e.g. investigator names, sample descriptions). The new data can be real or fake, but will have the same structure and include the same concepts / measures as the original study’s dataset. If the new data is a sample from the same population as the original study, then the expectation when undertaking replications is for results across studies to be broadly similar.
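The distinction between the two meanings can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the mean_score_by_group function, the column names and the (fake) data values are hypothetical and are not taken from any ready4 study.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(records, group_column, score_column):
    """Hypothetical analysis code: the mean score within each group."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[group_column]].append(record[score_column])
    return {group: mean(scores) for group, scores in sorted(grouped.items())}


# Reproduction: the same code applied to the same input data.
original_data = [
    {"group": "a", "score": 4},
    {"group": "a", "score": 6},
    {"group": "b", "score": 7},
    {"group": "b", "score": 9},
]
first_run = mean_score_by_group(original_data, "group", "score")
second_run = mean_score_by_group(original_data, "group", "score")
print(first_run == second_run)  # identical outputs are the expectation

# Replication: the same code applied to new (here, fake) data that has the
# same structure; the only edits are the variable names passed in.
new_data = [
    {"arm": "a", "total": 5},
    {"arm": "a", "total": 5},
    {"arm": "b", "total": 6},
    {"arm": "b", "total": 8},
]
replication = mean_score_by_group(new_data, "arm", "total")
print(replication)
```

In the replication run the outputs are not expected to be identical to the original, only comparable in structure and, where the samples come from the same population, broadly similar in value.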

Examples

Examples of both reproduction and replication code are available. When publishing analysis code we try, with some exceptions, to follow these rules of thumb:

If the data required to re-run a study analysis are publicly available (or declared by the analysis program itself), then we publish the code as a reproduction program (e.g. this program for creating a synthetic population).
If the data required to re-run a study analysis are not publicly available, we publish the replication version of the code. The replication version of the code may be configured to ingest a synthetic (fake) representation of the study dataset as with this utility mapping replication program. Details of the (minimal) steps required to revert the replication code to a version that can be used for reproduction purposes are typically embedded within the program itself.

5 - Transferability

Some models have the potential to be used in multiple contexts - but will often need adaptation for this to be appropriate.

It is common for discussions of scientific studies to consider the extent to which findings can be generalised (e.g. if a well conducted study concludes with high confidence that an intervention is cost-effective in Australia, is it valid to infer that it is likely to be cost-effective in the United Kingdom?). However, we are more interested in the transferability of computational models (e.g. the extent to which the data-structures and algorithms from a computational model developed for an Australian context can be used to explore similar topics in the United Kingdom). Our usage of the term “transferring” (and by extension “transferability”, “transferable”, “transfers”) reflects this motivation.

Transferring - our meaning

Adapting a computational model, in whole or in part, to extend the types of data and/or research questions to which it can be applied. The new types of data will differ in structure and / or concepts from the data to which the computational model had previously been applied, and these differences may be why research questions need to be reformulated.

When we use the term transferring, we are typically referring to either (a) authoring or (b) using one of the following:

An analysis program (or sub-routine) that has been adapted from an executable from another study to account for differences in the input data / research question.
Data-structures and algorithms that use inheritance to selectively re-use, discard and replace elements of a study’s computational model to suit an alternative use-case.
(Multi-purpose) function libraries that have been created by decomposing a study’s (single purpose) analysis program.

Examples

The scorz module library was originally developed to provide an R implementation of algorithms in other languages for scoring adolescent AQoL-6D health utility as part of a utility mapping study (which also used the analysis program mentioned above). Examples of all three approaches mentioned in the previous section can be seen by examining the documentation and source code of the scorz library:

Two vignette programs from the scorz library website score different utility instruments. The first program scores adolescent AQoL-6D health utility and acts as a template for the second, which has been modified to score EQ-5D health utility.
Inspecting those example programs shows that one of the key adaptations in the EQ-5D program is to use the ScorzEuroQol5 module instead of the ScorzAqol6Adol module. Both of these modules inherit from ScorzProfile. This arrangement means that all three modules share some features (in terms of both structure and algorithms) but selectively differ (e.g. aspects that are necessarily different for scoring different instruments).
The algorithms attached to each module from the scorz library are principally implemented by functions (the source code for which can be viewed here) that were created when decomposing an early draft of the above-mentioned study algorithm. These functions are called by module methods (source code viewable here).
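The general pattern described in the list above (a parent module, child modules that inherit from it, and module methods that call stand-alone library functions) can be sketched in Python as follows. This is not the scorz API, which is an R library. The class names, function names and scoring rules below are hypothetical placeholders, and they do not reproduce the actual AQoL-6D or EQ-5D scoring algorithms.

```python
def sum_items(responses):
    """Stand-alone library function (placeholder rule: add the item responses)."""
    return sum(responses)


def worst_item(responses):
    """Stand-alone library function (placeholder rule: take the highest response)."""
    return max(responses)


class ProfileScorer:
    """Hypothetical parent module: structure and algorithms shared by all children."""

    item_count = 0  # each child declares how many items its instrument has

    def __init__(self, responses):
        self.responses = list(responses)

    def check_responses(self):
        """Shared algorithm, inherited unchanged by every child."""
        if len(self.responses) != self.item_count:
            raise ValueError(
                f"expected {self.item_count} items, got {len(self.responses)}"
            )

    def score(self):
        """Shared workflow: validate, then apply the child's own scoring rule."""
        self.check_responses()
        return self.combine()

    def combine(self):
        raise NotImplementedError("child modules supply the scoring rule")


class SixItemScorer(ProfileScorer):
    """Hypothetical child module for a six-item instrument."""

    item_count = 6

    def combine(self):
        return sum_items(self.responses)  # method delegates to a library function


class FiveItemScorer(ProfileScorer):
    """Hypothetical child adapted from the first by replacing selected elements."""

    item_count = 5

    def combine(self):
        return worst_item(self.responses)


first_score = SixItemScorer([1, 2, 1, 3, 2, 1]).score()
second_score = FiveItemScorer([1, 3, 2, 1, 2]).score()
print(first_score, second_score)
```

Only the item count and the scoring rule differ between the two children; validation and the overall workflow are re-used from the parent, which is the kind of selective re-use and replacement that transferring relies on.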