DOI: 10.14440/jbm.2019.299
RESOURCE

FairSubset: A tool to choose representative subsets of data for use with replicates or groups of different sample sizes

Katherine K Ortell 1, Pawel M Switonski 2,3, Joe Ryan Delaney 1
1 Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC 29425, USA
2 Department of Neurology, Duke University School of Medicine, Durham, NC 27710, USA
3 The Duke Center for Neurodegeneration & Neurotherapeutics, Duke University School of Medicine, Durham, NC 27710, USA
JBM 2019, 6(3), 1
Published: 3 September 2019
© 2019 by the author. Licensee POL Scientific, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).
Abstract

High-impact journals are promoting transparency of data. Modern scientific methods can be automated and can produce disparate sample sizes. In many cases, it is desirable to retain identical or pre-defined sample sizes between replicates or groups. However, choosing the subset of the originally acquired data that best matches the entire data set, without introducing bias, is not trivial. Here, we released a free online tool, FairSubset, and its constituent Shiny App R code to subset data in an unbiased fashion. Subsets were set to the same N across samples and retained representative average and standard deviation information. The method can be used to quantitate entire fields of view or other replicates without biasing the data pool toward large-N samples. We showed examples of the tool’s use with fluorescence data and DNA damage-related Comet tail quantitation. The FairSubset tool, and its method of retaining distribution information at the single-datum level, may be considered for standardized use in fair publishing practices.
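The idea of picking a same-N subset that preserves the average and standard deviation can be illustrated with a short Python sketch. This is not the FairSubset R code, and the abstract does not specify the selection procedure; the sketch assumes one plausible approach, which is to draw many random subsets of size n and keep the one whose mean and standard deviation lie closest to those of the full sample. The function names `fair_subset_sketch` and `subset_all`, the `trials` parameter, and the scoring rule are all hypothetical.

```python
import random
import statistics


def fair_subset_sketch(values, n, trials=1000, seed=0):
    """Return the random size-n subset whose mean and SD best match `values`.

    Assumption: "best match" is the smallest sum of absolute differences in
    mean and SD, scaled by the full-sample SD. The real tool may differ.
    """
    if not 2 <= n <= len(values):
        raise ValueError("n must be between 2 and len(values)")
    rng = random.Random(seed)  # fixed seed so the choice is reproducible
    target_mean = statistics.mean(values)
    target_sd = statistics.stdev(values)
    scale = target_sd or 1.0  # avoid dividing by zero for constant data
    best, best_score = None, float("inf")
    for _ in range(trials):
        candidate = rng.sample(values, n)  # sampling without replacement
        score = (abs(statistics.mean(candidate) - target_mean)
                 + abs(statistics.stdev(candidate) - target_sd)) / scale
        if score < best_score:
            best, best_score = candidate, score
    return best


def subset_all(samples, n=None, **kwargs):
    """Subset every sample to a common n (default: the smallest sample size)."""
    if n is None:
        n = min(len(v) for v in samples.values())
    return {name: fair_subset_sketch(list(v), n, **kwargs)
            for name, v in samples.items()}
```

Calling `subset_all({"field_1": [...], "field_2": [...]})` returns one list per sample, each of the same length, so that a replicate with many measurements no longer dominates a pooled analysis. Candidates are chosen by random draw rather than by hand, which is what keeps the selection free of experimenter bias in this sketch.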

Keywords
statistics
normalization
automation
microscopy

Journal of Biological Methods, Electronic ISSN: 2326-9901 Print ISSN: TAB, Published by POL Scientific