Wrong analysis, over analysis, or just confusion?

I was updating my ORCID iD and SSRN account information a few months ago and was searching for something to write about. I found a recently completed dissertation on the relationship between corporate social responsibility and brand equity. Since marketing, specifically brand equity, is my domain, I was interested. During my reading, four issues immediately popped up in my head –

  • Descriptive statistics containing unimportant information
  • Exploratory factor analysis results being misinterpreted, while the important information was nowhere to be found
  • Decisions made about a scale based on an incomplete analysis
  • Statistical results normally used to reject a null hypothesis being ignored, and an alternative test employed (with the results misinterpreted and not fully analyzed)

Misreporting/Under-reporting Descriptive Statistics

On page 62 of this study, the emerging scholar reported the M, SD, Skewness, and Kurtosis of each survey item. In the paragraph leading up to Table 6, the author states –

Considering the absolute value of skewness is less than +/- 3, it is acceptable but still highly skewed (Aminu & Shariff, 2014; Kline, 2015).

p. 62

The author makes a similar statement about the kurtosis with a different cutoff (+/- 10). Skewness and kurtosis, which describe distributional shape, are of little use at the item level. They matter when the items are combined into the composite scales that are actually analyzed (see Table 7, p. 64 of the study). Did the emerging scholar report the M/SD or distributional properties of the scales actually used in the study? Nope.
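To make the point concrete, here is a minimal Python sketch (the Likert responses are hypothetical, invented for illustration; scipy assumed available) of reporting distributional properties at the level of the composite scale rather than the individual items:

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Hypothetical 5-item Likert responses (rows = respondents, columns = items)
items = np.array([
    [4, 5, 4, 3, 5],
    [2, 2, 3, 2, 1],
    [5, 5, 5, 4, 5],
    [3, 4, 3, 3, 4],
    [1, 2, 1, 2, 2],
])

# The composite scale score is what gets analyzed, so report its
# M, SD, skewness, and kurtosis -- not the item-level values
scale = items.mean(axis=1)
print(scale.mean(), scale.std(ddof=1), skew(scale), kurtosis(scale))
```

The item-level table the study reported would instead describe each of the five columns separately, which tells the reader nothing about the distribution of the variable actually entered into the hypothesis tests.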

Exploratory Factor Analysis Under-reporting/Misinterpretation

The purpose of an Exploratory Factor Analysis (EFA) is two-fold. First, an EFA can reduce a series of survey items to a small set of factors. Second, an EFA can provide support for convergent (construct) and discriminant validity. The latter is demonstrated by reporting a loading matrix: factors in the columns, items in the rows, with each item's loading (weight) on each factor. See an example below:

[Example factor-loading matrix not reproduced here. Source: Qualtrics]

The emerging scholar stated she performed an EFA, which is fine; however, information similar to the example was not reported. What was reported was –

The principal component analysis, including the oblimin rotation was conducted and verified the Kaiser-Meyer-Olkin (KMO) measures of .808 for Corporate Social Responsibility, .857 for Multi-Dimensional Brand Equity, and .855 for Overall Brand Equity. These indicate a meaningful factor analysis as they are nearing 1.0 and are much higher than the .7 minimum suggested (Stevens, 2012).

p. 64

Here’s where the misinterpretation begins. The KMO measure is used to assess whether the items are suitable for factor analysis; it does not support the meaningfulness of the analysis.
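To see what KMO actually measures, here is a rough Python sketch (numpy assumed; the 3x3 correlation matrix is hypothetical) computing it from a correlation matrix. KMO is the ratio of the summed squared correlations to that sum plus the summed squared partial (anti-image) correlations:

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy for correlation matrix R."""
    Rinv = np.linalg.inv(R)
    d = np.sqrt(np.diag(Rinv))
    partial = -Rinv / np.outer(d, d)      # anti-image (partial) correlations
    np.fill_diagonal(partial, 0.0)
    R0 = R - np.diag(np.diag(R))          # off-diagonal correlations only
    ss_r = (R0 ** 2).sum()
    ss_p = (partial ** 2).sum()
    return ss_r / (ss_r + ss_p)

# Hypothetical: three items with pairwise r = .5
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
print(round(kmo(R), 3))
```

Note what is in the formula: correlations and partial correlations among the items. Nothing in it speaks to whether the extracted factors are interpretable or "meaningful" — only to whether the item correlations are factorable in the first place.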

The emerging scholar performed both an EFA and a reliability analysis on both survey instruments and their sub-dimensions. This was fine but somewhat redundant: an EFA would flag a non-reliable survey item by failing to load it on any dimension. Is somebody following a template or copying a process from another study?

Incomplete Scale Analysis

Composite scales were constructed. Box plots were used to identify outliers; both outliers and extreme outliers were found but retained (I suspect because of the small sample size, N = 66). Finally, both Kolmogorov-Smirnov (with Lilliefors correction) and Shapiro-Wilk tests of normality were used to assess whether the scales followed an approximately normal distribution (spoiler alert: none did). The emerging scholar then decided to use nonparametric tests for hypothesis testing.

These two tests and others (e.g., Anderson-Darling, Cramer-von Mises) can be sensitive to outliers. Since the outliers (and extreme outliers) were retained, that decision may have caused the normality problem. Reviewing a Q-Q plot could have assisted in visualizing how the outliers influenced the distribution. Perhaps removing the outliers would have supported a decision to assume an approximately normal distribution. Who knows?
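How much damage can retained outliers do to a normality test? A small Python sketch (scipy assumed) makes the sensitivity visible. The "clean" sample below is deliberately constructed from evenly spaced normal quantiles so it is essentially perfectly normal; a single extreme value flips the Shapiro-Wilk verdict:

```python
import numpy as np
from scipy.stats import norm, shapiro

# A deterministic, essentially normal sample: 60 evenly spaced normal quantiles
clean = norm.ppf(np.linspace(0.01, 0.99, 60))
stat_c, p_clean = shapiro(clean)            # large p: normality not rejected

# Append one extreme outlier and the same test rejects normality outright
with_outlier = np.append(clean, 8.0)
stat_o, p_outlier = shapiro(with_outlier)   # tiny p: normality rejected

print(p_clean, p_outlier)
```

This is exactly the scenario in the study: a distribution that might have passed as approximately normal can fail the formal tests solely because of a few retained extreme cases, which is why a Q-Q plot (or a sensitivity re-run without the outliers) should have been examined before abandoning parametric methods.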

Over Analysis/Incorrect Analysis

In the hypothesis testing (pp. 68-71), it gets confusing. Really confusing…

For RQ1, Spearman’s rank-order correlation coefficient was used as the test statistic (probably due to the normality issue). The test was significant with a large/strong effect size (rs = .692, p < .001). That was all that was needed. However, the emerging scholar then performed simple regression (she states multiple regression, but it wasn’t), where an r2 of 0.517 was calculated (r = .719). The validity of regression depends on several assumptions, one of which is normally distributed errors. Without an analysis of the residuals, there is no evidence that the linear model is appropriate. Why run a regression at all when a nonparametric relationship between the variables has already been established? Also, the student refers to not being able to perform a t-test because of the non-parametric issues. A t-test?
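The distinction matters because the two statistics answer different questions. A quick Python illustration (scipy assumed; the data are synthetic): Spearman's coefficient measures monotonic association, so it can be perfect even when a straight line is the wrong model:

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

x = np.arange(1.0, 21.0)
y = x ** 3                      # perfectly monotone, clearly nonlinear

rho, _ = spearmanr(x, y)        # rank-order association is perfect
r, _ = pearsonr(x, y)           # linear association is not

print(rho, r)
```

A significant rs therefore says nothing about whether a linear regression is valid; that case has to be made separately, from the residuals.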

For RQ2, the emerging scholar performed multiple regression using the four sub-dimensions of CSR as IVs with the DV being composite Brand Equity. The purpose was to determine which of the four dimensions was “most associated” with brand equity. What does “most associated” mean? The strongest relationship?

The emerging scholar reported both Spearman’s correlation coefficient and regression unstandardized/standardized beta values, along with t values and p values. Why? Who knows? No regression diagnostics charts were included to substantiate that a linear model was appropriate.

In RQ3, which is the funniest one to me, the emerging scholar somewhat mirrored RQ2 but flipped the direction, using the three brand equity dimensions as the IVs and CSR as the DV. If one is using multiple regression, one is implying some directional (predictive, if not causal) relationship. Does brand equity cause corporate social responsibility? However, only Spearman correlation coefficients were reported. Why not multiple regression, as in RQ2? And through all of this, where are the controls for Type I error (e.g., a Bonferroni correction)?

Summary

Somebody got confused or lost along the journey: a standard template may have been used with boxes being checked, the committee didn’t read the paper in detail, or the committee didn’t understand what was going on. The competence of committee members always concerns me. I have only pointed out the highlights; if you review the emerging scholar’s data analysis plan, you will find more comedy (since when does a statistical test calculate a standard deviation?).

More infrastructure supporting students and faculty is definitely needed.

Reference:

Smith, M. (2021). The relationship between corporate social responsibility and brand equity within the business-to-business service sector (Doctoral dissertation, Columbia Southern University). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3850480

Type I errors galore…

Legatti-Maddox (2019) explored the moderating effect of two leadership styles, transformational and transactional, on the relationship between four types of humor and Organizational Citizenship Behavior. The sample for this study was 42 MBA students.

First, the sample size (N = 42) concerned me, since it seemed too small to detect a practical (moderate) effect (f2 = 0.15). Using the pwr.f2.test() function from R’s pwr package (Champely, 2020), a denominator df (v) of about 73 is required with three independent variables and a minimum power of .80, implying a sample of at least 77 (N = u + v + 1; see below).

          u = 3
          v = 72.70583
         f2 = 0.15
  sig.level = 0.05
      power = 0.8

So, it appears the study was underpowered by design. With underpowered studies there is a low probability of finding true effects and any effects could be false. Let’s move forward…
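For readers without R, the pwr.f2.test() output above can be reproduced from first principles with scipy's noncentral F distribution; pwr computes power with a noncentrality parameter of f2*(u + v + 1), and v is the denominator df, so the implied total sample is N = u + v + 1:

```python
from scipy.stats import f, ncf

def f2_power(u, v, f2, alpha=0.05):
    """Power of the F test for R^2 in multiple regression (as in pwr::pwr.f2.test)."""
    crit = f.ppf(1 - alpha, u, v)   # critical value under the null
    ncp = f2 * (u + v + 1)          # noncentrality parameter used by pwr
    return 1 - ncf.cdf(crit, u, v, ncp)

# Smallest denominator df giving power >= .80 with u = 3 predictors, f2 = .15
v = next(v for v in range(2, 500) if f2_power(3, v, 0.15) >= 0.80)
n = v + 3 + 1                       # N = v + u + 1
print(v, n)
```

Against that requirement, the study's N = 42 falls well short, which supports the underpowered-by-design conclusion.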

Second, when performing a moderation analysis, one has to enter both independent variables along with the interaction (see below)

Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2)

If the interaction term (X1*X2) is significant, then the interaction is explored, and the independent variables’ direct effects generally lose their value from a research perspective. However, since no p-values were reported for the independent variables or the interaction term, no evidence of moderation was provided by the emerging scholar. Instead, the p-values of the unmoderated and moderated models were compared, and an increase in the F-statistic was reported as evidence of a moderating effect. That is a flawed approach, and when the null hypothesis is rejected on that basis, a Type I error ensues.
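A correct moderation test is short. Here is a hedged Python sketch with simulated data (statsmodels and pandas assumed; the coefficients and sample size are invented) showing that the evidence for moderation is the interaction term's own coefficient and p-value, not a comparison of model F statistics:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data containing a genuine interaction effect
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 * x1 + 0.5 * x2 + 0.8 * x1 * x2 + rng.normal(size=n)

df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})
model = smf.ols("y ~ x1 * x2", data=df).fit()   # expands to x1 + x2 + x1:x2

# The moderation evidence is the interaction term's own p-value
p_interaction = model.pvalues["x1:x2"]
print(p_interaction)
```

The `x1 * x2` formula syntax automatically enters both main effects along with the interaction, which is exactly the model the equation above specifies.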

In a prior post, I discussed P-P plots vs. Q-Q plots and how the P-P plot is a default option in SPSS regression output. This emerging scholar used that plot (from SPSS) and stated that the homoscedasticity assumption was met. A normal P-P plot of residuals, however, speaks to the normality of the errors, not to whether their variance is constant.

Figure 1. Normal P-P Plot of Regression Standardized Residual (p. 70)

However, there is no reference to the independent variables or to which model the plot belongs. I wonder what would have happened if her faculty advisor had challenged her and said the residuals are heteroskedastic.

Finally, a quick look at a summary table in the study (Table 19).

A learned faculty should have counseled this student that the p-values need to be adjusted for family-wise error, since the student is effectively testing nine models at once. The widely cited Bonferroni correction gives a new significance threshold of 0.0056 (.05/9). If applied, only Model 9 may have met the criterion. However, the focus of the study was not on whether a model could be constructed, but on whether the interaction of humor and leadership explained the relationship better than the direct effects. Thus, more Type I errors.
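The arithmetic of the correction is trivial, which makes its omission harder to excuse. A minimal Python sketch (the per-model p-values below are hypothetical, invented for illustration, not the study's actual values):

```python
# Family-wise correction for testing m = 9 models at alpha = .05
alpha = 0.05
m = 9
bonferroni_alpha = alpha / m      # per-test threshold, about 0.0056

# Hypothetical per-model p-values (illustrative only)
pvals = [0.048, 0.021, 0.012, 0.030, 0.044, 0.009, 0.017, 0.038, 0.004]
survivors = [p for p in pvals if p < bonferroni_alpha]
print(bonferroni_alpha, survivors)
```

Note how many nominally "significant" results at .05 fail to clear the corrected threshold; that is precisely the Type I error inflation the correction guards against.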

The interaction of humor and leadership may influence OCB, but this study provides no evidence. The results of this study should be ignored.

Student Note: The way to approach this would be through a structural equation model (SEM) that controls for Type I errors.

Reference:

Champely, S. (2020, March 16). pwr: Basic functions for power analysis. https://cran.r-project.org/web/packages/pwr/pwr.pdf

Legatti-Maddox, A. C. (2019). Humor style in the workplace as it relates to leadership style and organizational citizenship behavior (Doctoral dissertation). ProQuest Dissertations & Theses Global: The Humanities and Social Sciences Collection. (22622521)