Concepts
- 1: Model
- 2: Module
- 3: Modelling project
- 4: Reproducible research
- 5: Transferability
1 - Model
A model is a simplified representation of a system of interest. In the way we use the term, we also mean that a model is:
- abstract and general (i.e. largely free of non-modifiable data, including numeric values, that are assumption- or context-specific) and
- a tool (i.e. a model can be used to help undertake an analysis, it is not the analysis itself).
If a model is developed primarily to inform a decision or set of decisions (e.g. relating to youth mental health policy and system design) it can be called a decision model.
Ideally, a model should have three inter-related representations: conceptual, mathematical and computational.
Conceptual Model
A conceptual model refers to underlying theory and beliefs about a system of interest that can be described in words and pictures.
Mathematical Model
A mathematical model formalises a conceptual model as a set of equations.
Computational Model
A computational model implements the conceptual and mathematical models of a system of interest as computer code.
ready4 is a computational model of youth mental health. More specifically, the ready4 computational model is the complete set of ready4 modules.
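To make the distinction between the three representations concrete, the following is a deliberately simple toy example. It is not drawn from ready4, and every name in it is invented for illustration: a conceptual statement about help-seeking, the equation that formalises it, and a small R function that implements it.

```r
# Toy illustration only (not part of ready4; all names are hypothetical).
#
# Conceptual model: the number of young people seeking help grows each year in
#                   proportion to the number who sought help the previous year.
# Mathematical model: H(t + 1) = (1 + g) * H(t), where H(t) is the number of
#                     help-seekers in year t and g is the annual growth rate.
# Computational model: the same relationship implemented as executable code.

project_help_seekers <- function(initial_count, growth_rate, years) {
  counts <- numeric(years + 1)
  counts[1] <- initial_count
  for (t in seq_len(years)) {
    counts[t + 1] <- (1 + growth_rate) * counts[t]
  }
  counts
}

# Example usage: ten years of 5% annual growth from 1,000 help-seekers.
project_help_seekers(initial_count = 1000, growth_rate = 0.05, years = 10)
```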
2 - Module
A modular computational model is one that is constructed from multiple self-contained components, called modules. Each ready4 module describes a data structure and the set of algorithms that can be applied to it.
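As a rough sketch of what this coupling of a data structure with its algorithms can look like in R, the example below bundles a data definition with a method that operates on it. The class, slot and function names are hypothetical placeholders, not the actual ready4 module template.

```r
# Schematic sketch only: hypothetical names, not the ready4 module template.
library(methods)

# The data structure: an S4 class whose slots describe the data a module holds.
setClass("HypotheticalCostsModule",
         slots = c(unit_costs_tb = "data.frame",
                   currency_1L_chr = "character"))

# An algorithm that can be applied to that data structure, exposed as a method.
setGeneric("summarise_costs", function(x) standardGeneric("summarise_costs"))
setMethod("summarise_costs", "HypotheticalCostsModule", function(x) {
  sum(x@unit_costs_tb$cost_dbl)  # total the unit costs held by the module
})

# Example usage with toy data.
mod <- new("HypotheticalCostsModule",
           unit_costs_tb = data.frame(item_chr = c("GP visit", "Psychologist"),
                                      cost_dbl = c(80, 180)),
           currency_1L_chr = "AUD")
summarise_costs(mod)
```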
The advantages of developing ready4 as a modular model include:
- each module can be independently re-used in other computational models of the people, places, platforms and programs important to the mental health of young people;
- a complex model can be developed iteratively, beginning with a simple representation that is easier to validate and then progressively expanding in scope and functionality through the addition of new modules, which can be validated independently and jointly; and
- long-term and resource-intensive model development is more feasible because multiple independently managed and financed modelling projects can be combined.
To ensure that all ready4 modules can be safely and flexibly combined, each module is created from a template using authoring tools that support standardisation.
3 - Modelling project
As a complex, collaborative and long-term undertaking, it is not feasible for ready4 to be financed by a single funder or progressed as a single project. Instead, our mode of development is via multiple independent modelling projects, each with their own project governance and funding, but which adopt a common framework.
A ready4 modelling project will involve the three steps of:
- Developing and validating a computational model;
- Adding context-specific data to that computational model; and
- Applying the computational model to the supplied data to undertake analyses.
The key components of each step are summarised here.
4 - Reproducible research
Although there is widespread support within the scientific community for the importance of reproducible research, the definitions of key terms such as reproducibility and replicability can vary across disciplines and methodologies (e.g. with the extent to which computational modelling is used). In some cases, entirely different terms (e.g. repeatability) are preferred. The meanings we intend when using these terms are described below.
Reproduction and Replication
The distinctions we make between reproduction and replication have been guided by the approach outlined in a report by the National Academies of Sciences, Engineering and Medicine. However, we have adapted their definitions slightly as the meanings in that report were framed in terms of study findings / outcomes, whereas our usage relates more to intended objectives when deploying tools.
Meanings
Reproduction
Applying the same analysis code to the same input data with the expectation of generating identical outputs (with the exception of trivial artefacts like datestamps for when analysis reports were produced).
Replication
Applying analysis code used in a study to new input data. The analysis code is reused with only the minimal edits necessary to account for differences in input data paths and variable names and to update study metadata (e.g. investigator names, sample descriptions). The new data can be real or fake, but will have the same structure and concepts / measures as the original study's dataset. If the new data are a sample from the same population as the original study, then the expectation when undertaking replications is that results across studies will be broadly similar.
Examples
Examples of both reproduction and replication code are available. When publishing analysis code, we try (with some exceptions) to adopt the following rules of thumb:
- If the data required to re-run a study analysis are publicly available (or declared by the analysis program itself), then we publish the code as a reproduction program (e.g. this program for creating a synthetic population).
- If the data required to re-run a study analysis are not publicly available, we publish the replication version of the code. The replication version may be configured to ingest a synthetic (fake) representation of the study dataset, as with this utility mapping replication program. Details of the (minimal) steps required to revert the replication code to a version that can be used for reproduction purposes are typically embedded within the program itself; the sketch after this list illustrates the kind of change involved.
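The sketch below illustrates the kind of minimal edit that typically separates a replication program from its reproduction counterpart. All file paths, variable names and data values are invented for illustration and are not taken from an actual ready4 study program.

```r
# Schematic sketch only (hypothetical paths, names and toy data): in a
# replication program, typically only the ingest step and study metadata
# change; the analysis code that follows them is reused as is.

# Reproduction version ingests the original study dataset, e.g.:
#   ds_tb <- readRDS("data/original_study_records.RDS")
# Replication version ingests new (here, synthetic) data with the same structure:
ds_tb <- data.frame(id_chr = paste0("P", 1:5),
                    aqol6d_total_dbl = c(0.61, 0.74, 0.55, 0.83, 0.69))

study_meta_ls <- list(investigators_chr = "Replication study team",  # updated metadata
                      sample_desc_chr = "Synthetic replication sample")

# Unchanged analysis code: when the new data are sampled from the same
# population, results are expected to be broadly similar to (not identical
# with) those of the original study.
summary(ds_tb$aqol6d_total_dbl)
```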
5 - Transferability
It is common for discussions of scientific studies to consider the extent to which findings can be generalised (e.g. if a well conducted study concludes with high confidence that an intervention is cost-effective in Australia, is it valid to infer that it is likely to be cost-effective in the United Kingdom?). However, we are more interested in the transferability of computational models (e.g. the extent to which the data-structures and algorithms from a computational model developed for an Australian context can be used to explore similar topics in the United Kingdom). Our usage of the term “transferring” (and by extension “transferability”, “transferable”, “transfers”) reflects this motivation.
Transferring - our meaning
Adapting a computational model, in whole or in part, to extend the types of data and/or research questions to which it can be applied. The new types of data will differ in structure and/or concepts from those to which the computational model has previously been applied, and these differences may be the reason that research questions need to be reformulated.
When we use the term transferring, we are typically referring to either (a) authoring or (b) using one of the following:
- An analysis program (or sub-routine) that has been adapted from an executable used in another study so as to account for differences in the input data / research question.
- Data-structures and algorithms that inherit from those of a study's computational model, selectively re-using, discarding and replacing elements based on an alternative use-case.
- (Multi-purpose) function libraries that have been created by decomposing a study's (single-purpose) analysis program.
Examples
The scorz module library was originally developed to provide an R implementation of algorithms, originally written in other languages, for scoring adolescent AQoL-6D health utility as part of a utility mapping study (which also used the analysis program mentioned above). Examples of all three approaches mentioned in the previous section can be seen by examining the documentation and source code of the scorz library:
- Two vignette programs from the scorz library website score different utility instruments. The first program scores adolescent AQoL-6D health utility and acts as a template for the second, which has been modified to score EQ-5D health utility.
- Inspecting those example programs shows that one of the key adaptations in the EQ-5D program is its use of the ScorzEuroQol5 module instead of the ScorzAqol6Adol module. Both of these modules inherit from ScorzProfile, which means that all three modules share some features (in terms of both structure and algorithms) but selectively differ in aspects that are necessarily different for scoring different instruments (a simplified sketch of this pattern follows this list).
- The algorithms attached to each module from the scorz library are principally implemented by functions (the source code for which can be viewed here) that were created when decomposing an early draft of the above-mentioned study algorithm. These functions are called by module methods (source code viewable here).
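The sketch below illustrates the general inheritance pattern just described. Its class, slot and method names are simplified stand-ins that parallel ScorzProfile, ScorzAqol6Adol and ScorzEuroQol5 rather than reproducing the actual scorz source code, and the scoring rules are placeholders only.

```r
# Simplified sketch of the inheritance pattern (not the actual scorz classes).
library(methods)

# A parent class describing features shared by all instrument-scoring modules.
setClass("ProfileSketch",
         slots = c(items_tb = "data.frame",            # item responses
                   instrument_nm_1L_chr = "character"))

# Child classes inherit the shared structure via `contains` and differ only in
# aspects that are necessarily instrument-specific.
setClass("Aqol6AdolSketch", contains = "ProfileSketch")
setClass("EuroQol5Sketch", contains = "ProfileSketch")

# A shared generic with instrument-specific methods (placeholder scoring rules).
setGeneric("score_utility", function(x) standardGeneric("score_utility"))
setMethod("score_utility", "Aqol6AdolSketch", function(x) {
  rowMeans(x@items_tb)             # stand-in for the AQoL-6D adolescent algorithm
})
setMethod("score_utility", "EuroQol5Sketch", function(x) {
  1 - rowSums(x@items_tb) / 100    # stand-in for the EQ-5D algorithm
})
```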