Biological and medical endeavors are beginning to realize the benefits of artificial intelligence and machine learning. However, classification, prediction, and diagnostic (CPD) errors can cause significant losses, even loss of life. Hence, end users are best served when they have performance information relevant to their needs, which is the focus of this paper. Relative class size (rCS) is commonly recognized as a confounding factor in CPD evaluation. Unfortunately, rCS-invariant measures are not easily mapped to end user conditions. We determine a cause of rCS invariance: joint probability table (JPT) normalization. JPT normalization means that measures more efficacious for end users can be used without sacrificing invariance. An important revelation is that, without data normalization, the Matthews correlation coefficient (MCC) and information coefficient (IC) are not rCS invariant; this is a potential source of confusion, as we found that not all reports using MCC or IC normalize their data. We derive an rCS-invariant expression for MCC. JPT normalization can be extended to allow the JPT rCS to be set to any desired value (JPT tuning). This makes sensitivity analysis feasible, a benefit to both applied researchers and practitioners (end users). We apply our findings to two published CPD studies to illustrate how end users benefit.

1. Introduction

Biological compounds and systems can be complex, making them difficult to analyze and challenging to understand. This has slowed the application of biological and medical advances in the field. Recently, artificial intelligence and machine learning, being particularly effective classification, prediction, and diagnostic (CPD) tools, have accelerated applied research and product development. CPD can be described as the act of comparing observations to models, then deciding whether or not the observations fit the model. Based on some predetermined criterion or criteria, a decision is made regarding class membership ( or ).
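The abstract's claim that MCC depends on relative class size unless the data are normalized can be illustrated with a small numerical sketch. The code below is not the paper's derivation. It assumes that JPT normalization amounts to rescaling each actual-class group of a 2x2 table to a fixed share of the total mass while preserving the per-class rates, and the function names `mcc` and `tune_jpt` are hypothetical.

```python
import math


def mcc(tp, fn, fp, tn):
    """Matthews correlation coefficient from the four confusion-table entries."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def tune_jpt(tp, fn, fp, tn, pos_fraction=0.5):
    """Rescale the table so actual positives carry `pos_fraction` of the mass.

    Assumed reading of JPT normalization/tuning: each actual-class group is
    rescaled separately, so the true positive rate and true negative rate
    are unchanged. pos_fraction=0.5 gives equal class sizes.
    """
    pos = tp + fn  # total actual positives
    neg = fp + tn  # total actual negatives
    return (
        pos_fraction * tp / pos,
        pos_fraction * fn / pos,
        (1.0 - pos_fraction) * fp / neg,
        (1.0 - pos_fraction) * tn / neg,
    )


# Same classifier behaviour (TPR = 0.9, TNR = 0.8) at two relative class sizes.
balanced = (90, 10, 20, 80)      # 100 positives, 100 negatives
imbalanced = (90, 10, 200, 800)  # 100 positives, 1000 negatives

print("raw MCC, balanced:    ", mcc(*balanced))
print("raw MCC, imbalanced:  ", mcc(*imbalanced))
print("normalized, balanced:  ", mcc(*tune_jpt(*balanced)))
print("normalized, imbalanced:", mcc(*tune_jpt(*imbalanced)))
```

Under these assumptions, the raw MCC values differ between the two tables even though the per-class rates are identical, while the normalized values agree. Passing a different `pos_fraction` to `tune_jpt` mimics the idea of setting the relative class size to a chosen value for sensitivity analysis.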
In many domains, class affiliation is not the end result; rather, it is used to determine subsequent activities. Examples include medical diagnoses, bioinformatics, intrusion detection, information retrieval, and patent classification. The list is virtually endless. Incorrect CPD output can lead to frustration, financial loss, and even death, so correct CPD output is important. Hence, a number of CPD algorithms have been developed, and the field continues to be active. Characterizing CPD effectiveness, then, is necessary. For example, CPD tool developers need to know how their particular modification affects CPD performance, and