THE ASCENSION OF FALSE POSITIVES: MULTIPLE REGRESSION SHOULD BE GIVEN LESS EMPHASIS
Multiple regression is a prestigious statistical strategy. Unfortunately, it is flawed. This article defines the term “false positives” and then presents an “example,” the “premier critic,” “non-fraud errors,” “possible corrections,” and the “view from outsiders.”
The author suggests continuing to use multiple regression, but reporting less powerful and more accurate measures first, followed by the beta weights of the multiple regression analysis.
Social sciences findings shaky
“The reliability of social science studies has been thrown into question, after an attempt to replicate project 21 high profile experiments yielded only 13 successful reproductions. The two year research project centered on studies that were published between 2010 and 2015 in highly respected journals Science and Nature…” THE WEEK, September 14, 2018, p. 20
Introduction:
The author attempts to reconstruct how a researcher views the conduct of science from inside organized academic institutions. The reader should note that other sources in science are trying to resolve this dilemma, and how hard a task their members face (see above). The term is defined first, followed by an example and an introduction to a “premier meta-researcher,” as well as examples of non-fraud errors, possible corrections, and an overview. The author has been a science writer for both a university and the federal government, as well as a grant writer for a two-year college ($1,000,000). He is also on the board of two academic journals and over the years has been a reviewer for many journals (joelsnell.com).
Discussion
Valid research does not have false positives (type 1 errors) or false negatives (type 2 errors). A false positive indicates a relationship between at least two variables when none exists. A false negative suggests that two or more variables are not related when in fact they co-vary. Both are problems for “multiple regression,” in which a dependent variable is regressed on a number of independent variables. An “analysis of variance” generally precedes the multiple regression and uses ratio analysis. The types of multiple regression are open regression, step-wise, and path analysis. The major problem is false positives. The results are further blurred by mixing number intervals (Staff, 2018), mixing continuous and discrete numbers (Kerlinger, F. & E. Pedhazur, 1973), and perhaps other non-fraud errors such as small differences in large samples (Staff, Answers.com 2018).
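To make the false positive problem concrete, the following is a minimal sketch and not part of the original study. It assumes independent tests at the conventional 5% level and, as a simplification, tests each noise predictor on its own with a correlation test rather than jointly in a regression. The function names and the simulated data are hypothetical.

```python
import math
import random


def familywise_error_rate(alpha, k):
    """Chance of at least one false positive among k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulated_false_positive_rate(n_tests=2000, n=100, t_crit=1.984, seed=1):
    """Share of pure-noise predictors flagged as 'significant'.

    t_crit is the two-sided 5% critical value of t for n - 2 = 98 degrees of freedom.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # unrelated to x by construction
        r = pearson_r(x, y)
        t = r * math.sqrt((n - 2) / (1 - r * r))
        if abs(t) > t_crit:
            hits += 1
    return hits / n_tests


# With 20 unrelated predictors, at least one spurious "finding" is more likely than not.
print(round(familywise_error_rate(0.05, 20), 3))
print(simulated_false_positive_rate())
```

Under these assumptions, roughly one noise variable in twenty passes the significance test, so a model with many candidate predictors will often report a relationship that is not there.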
Example
The author conducted a study titled WHAT EVER HAPPENED TO THE CLASS OF 1985? The sample, well over 200 respondents, was drawn from a Midwestern junior and community college, and the variables were ones common to education. The multiple regression produced scattered beta weights and no information that might be helpful to the field, so the author took each variable in turn and analyzed it manually with chi-square. The results were that students who had a “B” average or higher in high school, had time to study, and had the income to organize their lives (social class) graduated in five years with a bachelor’s degree. The answers were common sense but useful to the field of education. For this study, multiple regression was not helpful. Further, this was the kind of study that may become an unpublished file drawer paper. Academia should publish such studies, and this one was published (Snell & Green, 1987).
Special Individual
John Ioannidis is the premier figure in meta-research, as noted by The Atlantic (Freedman, 2010). A talent, he felt fortunate to be spurred on to look at the 25 most prestigious journals in medical science. What he found is that many of the articles, when re-tested, were simply wrong. He then wrote a paper that was published in PLoS Medicine in 2005.
PLoS Medicine is an online journal. The paper has been the most read article in the behavioral methodologies to date. Its title is “Why Most Published Research Findings Are False.” Ioannidis thought that it would be controversial. It was not. It was welcomed.
Both the hard sciences and the soft methodologies have seen a resurgence of pre-modernism (divinity and trial and error) as well as meta-modernism (once called postmodernism, now reorganized to deconstruct and question).
A few years back, this author attempted to do the same thing in “Multi-regression and Its Discontents” (Snell & Marsh, 2012). The author, who was involved in real time in the “Big Data Days” (early 1970s) of the social methodologies, found that the emotion exceeded the intellect. The excitement was that the soft sciences could become more prestigious with the use of the computer and software. The professor admitted that you must believe in hard numbers and that they could be applied to all variables. Stevens, in the 1940s (Velleman & Wilkinson, 1993), introduced the terms nominal, ordinal, interval, and ratio for levels of measurement. So hard number theory should absorb even nominal “dummy variables” and subject them to interval analysis (socialresearchmethods.net). No, that does not happen. Multiple regression nevertheless became part of the curriculum. Students who wanted to be published discovered that the stretch toward the harder sciences was acceptable. This author did the opposite: the data were analyzed more simply, with a less powerful test.
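For readers unfamiliar with the dummy-variable step referred to above, here is a minimal sketch of how a nominal variable is usually recoded so that regression software will accept it. The function name, the category labels, and the data are hypothetical and serve only as illustration.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into 0/1 indicator columns, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical nominal data: a student's field of study.
majors = ["arts", "business", "science", "arts", "science"]
columns = dummy_code(majors, reference="arts")
print(columns)
# {'business': [0, 1, 0, 0, 0], 'science': [0, 0, 1, 0, 1]}
```

The 0 and 1 codes are labels for category membership, not measured quantities, which is one reason to be cautious about treating the recoded columns as interval data.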
Non-fraud errors
Fraud in science is a source of concern, and it generally has severe consequences for the researcher. However, this is a discussion of errors made inadvertently. What causes the errors?
1. Data entry mistakes.
2. Writing manuscripts to fit the multiple regression model.
3. Burying, or “file drawer dumping,” research that is not statistically significant.
4. Using small samples, which generally provide less than valid results.
5. Small differences that are significant only because of large samples, which are misleading.
6. Greater flexibility in design, which can cause false positives.
7. Funding: those who fund the study may want the multiple regression to show particular findings.
There are many more, and the author urges the reader to consult Ioannidis (2005) for further information.
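Item 5 can be illustrated with a short sketch. It assumes a standard two-proportion z-test with a normal approximation, and the counts are invented for illustration. The same one-point difference (51% versus 50%) is tested at two sample sizes.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions (pooled variance)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


# Hypothetical counts: 51% vs 50% with 100 per group, then with 1,000,000 per group.
z_small, p_small = two_proportion_z_test(51, 100, 50, 100)
z_large, p_large = two_proportion_z_test(510_000, 1_000_000, 500_000, 1_000_000)
print(f"n=100 per group:       z={z_small:.2f}, p={p_small:.3f}")
print(f"n=1,000,000 per group: z={z_large:.2f}, p={p_large:.3g}")
```

The difference is far from significant in the small sample and overwhelmingly significant in the large one, although its practical size has not changed. Reporting the size of the difference alongside the p-value guards against this misreading.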
Since Ioannidis’s article, a number of other exotics have become apparent. They were discussed in Social Essays on Chaos Theory (Snell, Cangemi, & Kowalski, 2008).
They are:
- Gaussian medians, generally known by students, are replaced with Mandelbrot medians based on key historic events like 9/11/2001. Such a key median made major changes in society.
- Chaos theory’s black swans, chaotic butterflies, iterations, and related concepts. These are discussed in “Multiple Regression and Its Discontents” by the author and Dr. Mitchell Marsh.
- Serendipity, which is discussed in the same article and originates from Robert Merton Sr.’s last book on the subject (Merton & Barber, 2004). In other words, the article tries to describe all those pre-modern and meta-modern exotics that can sully the results.
The senior Merton was “Mr. Sociology” of the 1950s and 1960s and was helpful to the author on two articles (Snell, J. 2006).
Possible Corrections
There may be some strategies to improve the validity of multiple regression research. They include:
- At least one journal that publishes findings that are not significant, provided that the analysis in the article has used correct procedures. This reduces the file drawer problem.
- Raw statistics of a multiple regression article are placed on a website so that reviewers can do their own analysis.
- Triangulation of the research topic, so that the article has both quantitative and qualitative analysis of the topic.
- An independent analysis of the research by someone from another complementary field, who may spot an error or raise questions that could help the article.
- A professional replicator’s name on the manuscript, so that the accepted article has been both refereed and replicated.
- A journal that publishes literature reviews of topics that have recently been heavily published. What are their findings? It could also be a journal of replications.
Overview
Perhaps future research will carry disclaimers and note the necessity for replication. Further, a printout that includes the nominal chi-square analysis may be more helpful. Further still, a multi-tabular chi-square, in which each individual independent variable is “wheeled” in turn so that numerous findings occur and are then ranked according to chi-square value, can be “cautiously accepted.” That means that such information is preliminary and less powerful. It still ranks as a publishable article.
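The ranked chi-square procedure described above might be sketched as follows. This is one possible reading, not the author's actual program. It assumes every independent variable is dichotomous, so each table is 2x2 with one degree of freedom and the chi-square values are directly comparable. The variable names and counts are hypothetical and are not data from the 1987 study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))  # exact for 1 degree of freedom
    return stat, p_value


def rank_predictors(tables):
    """Test each predictor against the outcome and sort by chi-square, largest first."""
    results = [(name, *chi_square_2x2(table)) for name, table in tables.items()]
    return sorted(results, key=lambda result: result[1], reverse=True)


# Hypothetical tables: rows = predictor yes/no, columns = graduated yes/no.
tables = {
    "b_average_or_higher": [[60, 20], [40, 80]],
    "time_to_study": [[55, 35], [45, 65]],
    "lives_on_campus": [[50, 48], [50, 52]],
}
ranked = rank_predictors(tables)
for name, stat, p_value in ranked:
    print(f"{name:22s} chi-square={stat:6.2f}  p={p_value:.4f}")
```

Because several tables are tested at once, the ranking is best read as preliminary, which is consistent with treating such results as “cautiously accepted.”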
Conclusion:
The “Big Data Days” began in the early 1970s. The rush to be a harder science is understandable. However, we have just begun. Numerous strategies and filters may be created so that this area of the behavioral methodologies becomes recognized and accepted. In the meantime, simple statistics should be favored and publishable because they are more understandable, though less powerful. At least two parallel tracks can usher in new findings and offer a new introduction to those who want more complicated information but do not have the proclivity to handle it.
References Cited
Ioannidis, J. (2005) Why Most Published Research Findings Are False, PLoS Medicine, 2(8), e124.
Freedman, D. (2010) Lies, Damned Lies, and Medical Science, The Atlantic, November, 18-32.
Kerlinger, F. & E. Pedhazur (1973) Multiple Regression in Behavioral Research, New York: Holt, Rinehart and Winston, 445.
Merton, R. & E. Barber (2004) The Travels and Adventures of Serendipity, Princeton: Princeton University Press.
Merton, R. Sr. (personal correspondence)
Snell, J. & D. Green (1987) “An Appraisal of Transfer College Students: Shared Perceptions of Community College and University Professors,” College Student Journal, Summer, pp. 153-157.
Staff (2018) Social Science Findings Shaky, The Week, 9/14, 20.
Staff (2018) Multiple Hypothesis Testing and False Discovery Rate, Answers.com/STATC141, 1-2.
Staff (2018) Fundamentals of Statistics 1: Nominal, Ordinal, Interval, and Ratio, UsableStats.com/Lessons/Noir, 218.
Incidentally, multiple regression can be accomplished by converting all the data (nominal, ordinal, interval, and ratio) into interval data, which makes it possible to complete the multiple regression analysis. At least, that is one interpretation.
www.joelsnell.com
Joel Charles Snell, MA
Professor Emeritus
Kirkwood College
Cedar Rapids, Iowa
52402
Apple Wood Hills
3105 Alleghany Dr. NE
319-366-0063
joelsnell@hotmail.com