Earlier this week, I had a US-based student in a graduate-level marketing course discuss Slogar et al. (2023) in an assignment on entrepreneurial marketing. I read the article and pointed out to the student that it is not really about marketing but about entrepreneurial orientation and its influence on firm performance. Orientation, according to the authors, has three components: innovativeness, proactiveness, and risk-taking.
I then started looking at the article and its sources. Two sources popped out at me: Miller and Friesen (1982) and Covin and Slevin (1991). I read these articles in graduate school, so I decided to skim Slogar et al. to see how these researchers evaluated the sources and executed their research.
The first item I found of interest was Slogar et al.’s use of Country and Industry Effects as interval control variables. Country was defined on a 1-6 scale, with each country assigned a number. Industry Effects was set on a 1-8 scale, with specific industries assigned a number (8 representing “Other”). See the authors’ descriptive statistics (e.g., M, SD) for these categorical variables on p. 8 of their study. What Slogar et al. did not discuss was that a variable called Firm, defined as “the number of total employees within the firm” (p. 7), is an ordinal variable yet was also reported with interval statistics (M = 1.7, SD = 0.73). Treating these categorical and ordinal variables as interval, rather than dummy coding them, makes the control variables invalid and the rest of their model incorrect.
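To make the dummy-coding point concrete, here is a minimal sketch using hypothetical country and industry codes (not the actual data) that mirror the 1-6 and 1-8 schemes described above. Treating the raw codes as numbers implies, for example, that country “4” is twice country “2”; indicator columns carry no such implication.

```python
import pandas as pd

# Hypothetical rows mimicking the coding scheme described above:
# Country coded 1-6, Industry coded 1-8 (illustrative values only).
df = pd.DataFrame({"country": [1, 3, 6, 3], "industry": [2, 8, 5, 2]})

# Dummy (one-hot) coding treats each category as its own indicator,
# which is what a regression with categorical controls requires.
# drop_first=True leaves one reference level per variable.
dummies = pd.get_dummies(df, columns=["country", "industry"],
                         prefix=["country", "industry"], drop_first=True)
print(list(dummies.columns))
# ['country_3', 'country_6', 'industry_5', 'industry_8']
```

Each indicator then gets its own coefficient in the model, instead of a single slope on an arbitrary code.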
Another concern I have is how Slogar et al. surveyed “organizations” using the Covin and Slevin instrument, which was originally directed at firm owners. How can an organization take a survey? Slogar et al. describe sending surveys to 9,000 firms in southeast Europe and receiving only 963 usable responses; however, there is no statement of whether controls were in place to eliminate surveys completed by more than one person at a company.
There are a few more items I have concerns about, but I would need the data to confirm them. I’ve requested a copy of the data from the authors and have drafted a Letter to the Editor. I’ll give them 60 days before I send the letter.
Note: The journal where this article was published, Administrative Sciences, is an open-access journal. According to its webpage, the journal currently charges USD 2,200 per article. What’s the chance they would retract an article after accepting payment?
Miller, D., & Friesen, P. H. (1982). Innovation in conservative and entrepreneurial firms: Two models of strategic momentum. Strategic Management Journal, 3(1), 1-25. https://doi.org/10.1002/smj.4250030102
Slogar, H., Milovanović, B. M., & Hrvatin, S. (2023). Does the relationship between entrepreneurial orientation and subjective financial firm performance have an inverted U-shape? Evidence from Southeast European SMEs. Administrative Sciences, 13(2), Article 26. https://doi.org/10.3390/admsci13020026
My oldest son asked me once why I blog about the quality of mentor-reviewed, student-completed research. I explained to him that the quality of research being published, social science or not, has been shown to be fraught with errors.
I want to quote from Dr. Gelman’s last paragraph –
Research is not being done for the benefit of author or the journal; it’s for the benefit of the readers of the journal, and ultimately for society. If you don’t want your work to be publicly discussed, you shouldn’t publish it. We make criticisms in public for the same reason that we write articles for publication, because we think this work is ultimately of relevance to people on the outside.
I was updating my ORCID iD and SSRN account information a few months ago and was searching for something to write about. I found a recently completed dissertation on the relationship between corporate social responsibility and brand equity. Since marketing, specifically brand equity, is my domain, I was interested. During my reading, four issues immediately popped up in my head –
Descriptive statistics containing unimportant information
Exploratory factor analysis results being misinterpreted, while the important information was nowhere to be found
Decisions made about a scale based on an incomplete analysis
Statistical results normally used to reject a null hypothesis were ignored, and an alternative test was employed (with the results misinterpreted and not fully analyzed).
On page 62 of this study, the emerging scholar reported the M, SD, Skewness, and Kurtosis of each survey item. In the paragraph leading up to Table 6, the author states –
Considering the absolute value of skewness is less than +/- 3, it is acceptable but still highly skewed (Aminu & Shariff, 2014; Kline, 2015).
The author makes a similar statement about the kurtosis with a different cutoff (+/- 10). Skewness and kurtosis, which describe distributional shape, are irrelevant at the item level. They matter when the items are combined into the composite scale that is actually analyzed (see Table 7, p. 64 of the study). Did the emerging scholar report the M/SD or distributional properties of the scales actually used in the study? Nope.
The purpose of an Exploratory Factor Analysis (EFA) is two-fold. First, an EFA can reduce a series of survey items to a small set of factors. Second, an EFA can provide support for convergent (construct) and divergent validity. The latter is demonstrated by reporting the factors (vertical columns) and the loading of each item (horizontal rows). See an example below:
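One way to sketch such a loading table, on purely synthetic data: six hypothetical items built to load on two latent factors, then an EFA with a varimax rotation (sklearn does not offer the oblimin rotation the study used, so varimax stands in for illustration).

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 300
# Two synthetic latent factors; items 1-3 are constructed to load on
# the first factor and items 4-6 on the second (illustrative only).
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack(
    [f1 + rng.normal(scale=0.4, size=n) for _ in range(3)]
    + [f2 + rng.normal(scale=0.4, size=n) for _ in range(3)]
)

# The rotated loadings matrix is the item-by-factor table an EFA
# write-up should report.
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
loadings = fa.components_.T  # rows = items, columns = factors
print(np.round(loadings, 2))
```

Each row shows one item; a clean solution has each item loading strongly on one column and near zero on the other, which is exactly the evidence for divergent validity.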
The emerging scholar stated she performed an EFA, which is fine; however, information similar to the example was not reported. What was reported was –
The principal component analysis, including the oblimin rotation was conducted and verified the Kaiser-Meyer-Olkin (KMO) measures of .808 for Corporate Social Responsibility, .857 for Multi-Dimensional Brand Equity, and .855 for Overall Brand Equity. These indicate a meaningful factor analysis as they are nearing 1.0 and are much higher than the .7 minimum suggested (Stevens, 2012).
Here’s where the misinterpretation begins. The KMO measure assesses sampling adequacy, i.e., whether the items share enough common variance to be suitable for factor analysis; it says nothing about whether the resulting factor solution is meaningful.
The emerging scholar performed both an EFA and a reliability analysis on both survey instruments and their sub-dimensions. That is fine but somewhat redundant: an EFA would flag an unreliable survey item by failing to load it on any factor. Is somebody following a template or copying a process from another study?
Incomplete Scale Analysis
Composite scales were constructed. Box plots were used to identify outliers; outliers and extreme outliers were identified but retained (I suspect because of the small sample size, N = 66). Finally, both Kolmogorov-Smirnov (with Lilliefors correction) and Shapiro-Wilk tests of normality were used to assess whether the scales followed an approximately normal distribution (spoiler alert: none did). The emerging scholar then decided to use nonparametric tests for hypothesis testing.
These two tests, and others (e.g., Anderson-Darling, Cramér-von Mises), are sensitive to outliers. Since the outliers (and extreme outliers) were retained, that decision alone may have caused the normality problem. A Q-Q plot would have helped visualize how the outliers influenced the distribution, and removing them might have supported a decision to assume an approximately normal distribution. Who knows?
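The sensitivity is easy to demonstrate on hypothetical data (nothing here comes from the study): a normal sample of 66 scale scores passes Shapiro-Wilk until a handful of extreme values is appended.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(1)
# Hypothetical composite-scale scores, N = 66, drawn from a normal.
clean = rng.normal(loc=3.5, scale=0.6, size=66)
_, p_clean = shapiro(clean)

# Inject a few extreme outliers, as if retained in the analysis.
contaminated = np.append(clean, [7.5, 8.0, 0.2])
_, p_outliers = shapiro(contaminated)

print(f"clean p = {p_clean:.3f}, with outliers p = {p_outliers:.5f}")
```

Three retained outliers are enough to push the test to rejection, which is precisely why a Q-Q plot and an outlier-handling decision should precede the normality verdict.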
Over Analysis/Incorrect Analysis
In the hypothesis testing (pp. 68-71), it gets confusing. Really confusing…
For RQ1, Spearman’s rank-order coefficient was used as the test statistic (probably because of the normality issue). The test was significant with a large/strong effect size (rs = .692, p < .001). That was all that was needed. However, the emerging scholar then performed simple regression (she states multiple regression, but it wasn’t), where an r2 of .517 was calculated (r = .719). The validity of regression depends on several assumptions, one of which is normally distributed errors. Without an analysis of the residuals, there is no evidence the linear model is appropriate. Why run a regression at all when a nonparametric relationship between the variables has already been established? The student also makes reference to not being able to perform a t-test because of the non-normality. A t-test?
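The residual check that was skipped takes a few lines. This is a sketch on hypothetical predictor/outcome data (again N = 66 to mirror the study): fit the simple regression, then test the residuals, not the raw variables.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical predictor and outcome (illustrative data only).
x = rng.normal(size=66)
y = 0.7 * x + rng.normal(scale=0.5, size=66)

# Simple (one-predictor) regression; rvalue**2 plays the role of
# the r-squared the author reported.
res = stats.linregress(x, y)
residuals = y - (res.intercept + res.slope * x)

# The normally-distributed-errors assumption is checked on the
# residuals.
_, p = stats.shapiro(residuals)
print(f"R^2 = {res.rvalue**2:.3f}, residual normality p = {p:.3f}")
```

Only after a check like this (ideally alongside a residual Q-Q plot) does the reported r-squared mean anything.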
For RQ2, the emerging scholar performed multiple regression using the four sub-dimensions of CSR as IVs with the DV being composite Brand Equity. The purpose was to determine which of the four dimensions was “most associated” with brand equity. What does “most associated” mean? The strongest relationship?
The emerging scholar reported both Spearman’s correlation coefficient and regression unstandardized/standardized beta values, along with t values and p values. Why? Who knows? No regression diagnostics charts were included to substantiate that a linear model was appropriate.
In RQ3, which is the funniest one to me, the emerging scholar somewhat mirrored RQ2 but flipped the direction, using the three brand equity dimensions as the IVs and CSR as the DV. If one is using multiple regression, one is trying to establish or confirm some type of cause and effect. Does brand equity cause corporate social responsibility? Yet only Spearman correlation coefficients were reported. Why not multiple regression, as in RQ2? And through all this, where are the controls for Type I error (see the Bonferroni correction)?
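For anyone unfamiliar with the correction mentioned above, the arithmetic is trivial. With m tests at a familywise alpha of .05, each individual test is judged against alpha / m (the p values below are hypothetical, not the study’s):

```python
# Minimal Bonferroni sketch: familywise alpha split across m tests.
alpha = 0.05
p_values = [0.001, 0.020, 0.049]  # hypothetical per-test p values
m = len(p_values)
adjusted_alpha = alpha / m
decisions = [p < adjusted_alpha for p in p_values]
print(round(adjusted_alpha, 4), decisions)
# 0.0167 [True, False, False]
```

Note how a p of .049, “significant” in isolation, no longer survives once three tests share the error budget. That is the control entirely missing from the dissertation.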
Somebody got confused or lost along the journey, a standard template may have been used with boxes being checked, the committee didn’t read the paper in detail, or the committee didn’t understand what was going on. The competence of committee members always concerns me. I just pointed out the highlights; if you review the emerging scholar’s data analysis plan, you will find more comedy (since when does a statistical test calculate a standard deviation?).
More infrastructure supporting students and faculty is definitely needed.
I started writing this blog post about priming interviewees in qualitative research. Once I got into writing, however, I realized I had simply found another poorly performed qualitative study. Still, I did want to discuss aligning researcher-derived themes with research questions. Here’s the study –
Job Satisfaction and Job-Related Stress among NCAA Division II Athletic Directors in Historically Black Colleges and Universities
Name withheld (but you can search for the study)
I’ve been involved with many students exploring job satisfaction and job-related stress in a variety of industries, but I had never heard of a study on this topic among university athletic directors (ADs). What surprised me was that the study wasn’t quantitative; it was qualitative.
The emerging scholar’s overarching research question was –
What strategies do ADs at HBCUs implement to manage departments with limited resources?
What does the phrase ‘limited resources’ mean? It would seem that some form of quantitative measure would need to be used to separate athletic departments into categories based on resources. However, I found this sentence –
…there was an assumption that HBCU athletic directors would experience job dissatisfaction and job-related stress due to decreased funding, inadequate facility management, and inconsistent roster management
Wow! This statement makes it easy for a researcher…I’ll just assume something is happening whether true or not.
Now, a quick note about priming. The interview guide can be found in Appendix C of the dissertation. Honestly, it’s not really an interview guide. The student employed the ‘oral survey’ Q&A approach often suggested by faculty who have a limited understanding of qualitative data collection methodologies. Rather than critique the self-described “interview questions,” I will point out one issue –
Q3 – What strategies have you implemented to motivate your staff and thereby increase job satisfaction?
This question requires the interviewee to –
Understand the word strategy or, at a minimum, understand the researcher’s definition of the term
Differentiate a strategy from a tactic
Reflect on how a strategy has been specifically applied to or influenced staff motivation
Reflect on staff responses to the strategy and subjectively estimate its influence on their own level of job satisfaction
In other words, the emerging scholar placed the responsibility for the study’s results on the interviewees’ responses, not on the researcher’s interpretation of those responses. Ugh!
What would have happened if the emerging scholar simply started with –
How do you motivate your employees?
How do your employees respond to the techniques you employ to motivate?
When do you decide to change methods?
The aforementioned approach allows the interviewees to describe the methods they use to motivate employees, which the emerging scholar would then analyze as strategies or tactics. Each motivational technique could be explored in depth with follow-up questions and subsequently tied back to the literature. Next, the emerging scholar could explore with the interviewee how employees responded: did the descriptions align with expectations found in the literature? Finally, discussing changes in methods, and their impetus, could bring the interviews into alignment with the research question.
When I finally got to the themes, I chuckled:
Shared responsibility – “participants believed the workplace demands they face daily do not allow them to have the ability to make all decisions for the department. Having shared responsibilities among other leaders within the department was essential for each athletic director” (p. 97). Every job has some level of work demand. Some demands stem from a lack of resources (e.g., human capital); some do not (e.g., heavy lifting). In the academic literature, sharing responsibility within an organizational unit is a tenet of work-based teams. It would seem the study participants are simply employing widely referenced management techniques. However, since the emerging scholar assumed all HBCU ADs face limited resources, this had to be a theme.
Empowering staff – The emerging scholar didn’t describe what this phrase means; instead, she listed paraphrased material from external sources (two of the cited sources weren’t in the References). However, similar to shared responsibility, employee empowerment is an oft-studied topic in the literature.
Limited resources to grow facilities – The term ‘resources’ in this context relates to financial resources. ADs are often held accountable for promotion of their programs; however, how much of that job is part of their normal duties? Based on how the emerging scholar phrased the research question, this theme is not aligned with the research question.
Limited female participation – The emerging researcher delved into gender equity, the recruitment of women to play sports, and the balance between men’s and women’s sports. This topic relates to recruitment and probably says more about society than management…again, unrelated to the research question.
In the emerging scholar’s biography, she states that she works for an HBCU athletic department, so I understand the interest. She also states that she would like to pursue an athletic department job. That’s great! If you, too, are an emerging researcher and come across this study, that’s fine…just be wary about citing its results. Redo the research.
Yesterday, I briefly discussed face validity in the context of a student creating an instrument to measure a latent variable (e.g., usefulness, intention). Someone read my post and emailed me asking, “How would I measure face validity?” Well, face validity can’t be measured. Face validity answers the question – “Does this test, on its face, measure what it says it measures?” In other words, face validity is the perception that the test is appropriate or valid.
Why is face validity important? Rogers (1995) posited that if test takers believe a test does not have face validity, they will take it less seriously and hurry through; conversely, if they believe the test does have face validity, they will make a conscientious effort to answer honestly.
I advise students (and faculty) to impanel a few experts in the domain under study and get their thoughts on whether the pool of items in the test appear to measure what is under study. If they agree, the first hurdle is passed. The next hurdle is to perform an exploratory factor analysis.
I emphasized the word pool in the prior paragraph for a reason. Developing a valid survey instrument takes time. One of the most time-consuming tasks is creating a pool of items that appear to form a measurable dimension. The reason why one has to create a pool is that until the survey instrument is distributed, feedback is received, and exploratory factor analysis is performed, there is no way to confirm which items strongly form a construct. For example, to get the 36-item Job Satisfaction Survey, Spector (1985) reported he started with 74 items.
Rogers, T. B. (1995). The psychological testing enterprise: An introduction. Brooks-Cole.
Spector, P. E. (1985). Measurement of human service staff satisfaction: Development of the Job Satisfaction Survey. American Journal of Community Psychology, 13(6), 693-713. https://doi.org/10.1007/bf00929796