Tools and techniques for computational reproducibility.

Piccolo SR, Frampton MB - Gigascience (2016)

Bottom Line: When the steps of an analysis have been described in sufficient detail that others can retrace them and obtain similar results, the research is said to be reproducible. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed, and because of limitations in how scientists document analysis steps. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.


Affiliation: Department of Biology, Brigham Young University, Provo, UT, 84602, USA. stephen_piccolo@byu.edu.

ABSTRACT
When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed, and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.



Fig. 2: Example of a Make file. This file performs the same function as the command-line script shown in Fig. 1, except that it is formatted for the Make utility. Accordingly, it is structured so that specific tasks must be executed before other tasks, in a hierarchical manner. See Additional file 2 for an executable version of this file.


Mentions: When writing command-line scripts, it is essential to explicitly document any software dependencies and input data that are required for each step in the analysis. The Make utility [48, 49] provides one way to specify such requirements [36]. Before any command is executed, Make verifies that each documented dependency is available. Accordingly, researchers can use Make files (scripts) to specify a full hierarchy of operating system components and dependent software that must be present to perform the analysis (Fig. 2; Additional file 2). In addition, Make can automatically identify any commands that can be executed in parallel, potentially reducing the amount of time required for the analysis. Although Make was originally designed for UNIX-based operating systems (such as Mac OS or Linux), similar utilities have since been developed for Windows operating systems [50]. Table 1 lists various utilities that can be used to automate software execution.
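
The dependency check described above can be illustrated with a minimal Python sketch. It is not the Make file from Fig. 2 and not how Make is implemented: the function `build_order`, the exception `MissingDependencyError`, and all file names are hypothetical, and the sketch assumes a simplified model in which a prerequisite either already exists or has a rule to build it (Make's timestamp comparison is not modeled).

```python
from typing import Dict, List, Set


class MissingDependencyError(Exception):
    """Raised when a prerequisite neither exists nor has a rule to build it."""


def build_order(target: str,
                rules: Dict[str, List[str]],
                available: Set[str]) -> List[str]:
    """Return the targets that must be built, prerequisites first.

    rules maps each buildable target to its list of prerequisites.
    available holds inputs that already exist (raw data, scripts, tools).
    """
    order: List[str] = []
    visiting: Set[str] = set()  # targets on the current path (cycle detection)
    done: Set[str] = set()      # targets already scheduled

    def visit(name: str) -> None:
        if name in done or name in available:
            return
        if name not in rules:
            # Fail before any task is scheduled to run.
            raise MissingDependencyError(f"no rule to make '{name}'")
        if name in visiting:
            raise ValueError(f"circular dependency involving '{name}'")
        visiting.add(name)
        for dep in rules[name]:
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order


# Hypothetical two-step analysis; every name here is illustrative only.
rules = {
    "results.txt": ["clean.tsv", "analyze.py"],
    "clean.tsv": ["raw.tsv", "clean.py"],
}
available = {"raw.tsv", "clean.py", "analyze.py"}

print(build_order("results.txt", rules, available))
```

In this sketch the whole hierarchy is checked before any task is scheduled, so a missing script or data file surfaces as an error up front rather than partway through the analysis. Targets that do not depend on one another could, in principle, be run in parallel.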

