Principal Component Analysis (PCA)

 


 

Introduction

Principal Component Analysis (PCA) performs dimensionality reduction on a set of data, and is especially useful when applied to a scale that is attempting to evaluate a construct. The point of the process is to see whether a multi-item scale can be reduced to a simple structure with fewer components (Kline, 1994).

For example, Sander & Sanders (2009) conducted a factor analysis of their original, 24-item Academic Behavioural Confidence (ABC) Scale, finding that it could be reduced to a 17-item scale with 4 factors, which they designated as grades, verbalizing, studying and attendance. I have considered both the original 24-item scale and this reduced, 17-item, 4-factor scale in the analysis of the data collected in my project so far, and this has revealed interesting results that are reported elsewhere on this StudyBlog and in the project webpages.

Much like the well-used Cronbach’s ‘alpha’ measure of internal consistency reliability, a factor analysis is specific to the dataset to which it is applied. Hence the factor analysis that Sander & Sanders (ibid) used, and which generated their reduced-item scale with four factors, was derived from analysis of the collated datasets they had available from previous work with ABC, sizeable though this became (n=865). The factor structure that their analysis derived, however, may not be entirely applicable more generally, despite being widely used by other researchers in one form or another (eg: de la Fuente et al, 2013, de la Fuente et al, 2014, Hilale & Alexander, 2009, Ochoa et al, 2012, Willis, 2010, Keinhuis et al, 2011, Lynch & Webber, 2011, Shaukat & Bashir, 2016). Indeed, Stankov et al (in Boyle et al, 2015), in reviewing the Academic Behavioural Confidence Scale, implied that more work should be done on firming up some aspects of the ABC Scale, not so much by levelling criticism at its construction or theoretical underpinnings but more to suggest that, as a relatively new measure (post-2003), it would benefit from wider application in the field and subsequent scrutiny of how it is built and what it is attempting to measure. Hence conducting a factor analysis of the data I collected using the original 24-item ABC Scale is worthwhile because it may reveal an alternative factor structure that fits the context of my enquiry more appropriately, and so is also a response to Stankov’s remarks. I report more fully below on what emerged. First, however, I report my first factor analysis, which has been applied to the data collected from my Dyslexia Index scale.

Factor analysis of Dyslexia Index

In my main research questionnaire, the Likert scale that is attempting to evaluate Dyslexia Index is a 20-item scale. I want to see if items within my scale are reducible into a set of factors. In this way, I can identify sub-scales within the main scale which I can explore independently with Academic Behavioural Confidence, thus addressing the main research hypothesis of my project, that ‘students with an unidentified dyslexia-like profile present a higher level of academic confidence than both their dyslexia-identified peers and their non-dyslexic peers’. It is important to emphasize that this PCA process will enable the factor sub-scales to be related independently to data acquired through my other main metric (ABC) because each scale-item dimension of the Dyslexia Index scale is assigned to only one factor – hence none has cross-factor influence.

I can also use PCA to help identify scale items that I might consider redundant – that is, items that are not contributing to the evaluation of the construct in a helpful way and hence might be discarded. I have already interpreted the SPSS output for Cronbach’s α internal consistency reliability analysis to do this, which has tentatively suggested discarding 4 scale items from the Dyslexia Index, hence reducing it to a 16-item scale. However, this needs to be explored a little further: even though I’ve run a repeat Cronbach’s α for the reduced 16-item scale, I have yet to explore the impact on α when combinations of the possibly redundant items are withdrawn. For the original, 20-item scale, Cronbach’s α = 0.842 and for the 16-item scale α = 0.889, which sounds better, although a value of α > 0.7 doesn’t necessarily imply greater internal consistency reliability (Kline, 1986). I’ve written about ‘more than Cronbach’s α’ in another post. Suffice to say that effect size results for differences in ABC between research subgroups established through the 20-point Dx scale, compared with those established through the reduced, 16-point Dx scale, were only marginally different – details about this are also reported elsewhere on the StudyBlog.
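As an aside, the α calculation and the ‘withdraw combinations of items’ exploration mentioned above can be sketched outside SPSS. This is only a minimal illustration of the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of totals); the function names and the small response matrix are hypothetical and are not project data.

```python
from itertools import combinations
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def alpha_if_items_deleted(rows, drop):
    """Alpha after withdrawing the item columns whose indices are in `drop`."""
    kept = [[v for i, v in enumerate(row) if i not in drop] for row in rows]
    return cronbach_alpha(kept)


# Hypothetical responses: 5 respondents x 4 items (NOT project data).
responses = [
    [4, 5, 4, 1],
    [2, 3, 2, 5],
    [5, 5, 4, 2],
    [1, 2, 1, 4],
    [3, 4, 3, 3],
]

alpha_all = cronbach_alpha(responses)
alpha_without_last = alpha_if_items_deleted(responses, {3})

# Explore alpha when single items, or pairs of items, are withdrawn.
candidates = [2, 3]  # hypothetical indices of possibly redundant items
for size in (1, 2):
    for combo in combinations(candidates, size):
        print(combo, round(alpha_if_items_deleted(responses, set(combo)), 3))
```

In this made-up example the last item runs against the other three, so withdrawing it raises α sharply; with real data the changes would normally be far smaller.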

I have also been reviewing the data visualization techniques that I developed earlier in the enquiry to produce the profile charts that summarize the 6 psychoeducational constructs that my QNR is also attempting to evaluate as additional, supporting metrics. It is likely that a deeper investigation of the meaning contained in those visualizations will be for a later project, but the processes that I have learned and developed to create these profile charts have been transferable to other aspects of the project. To this end, I have built test charts for individual respondents’ Dyslexia Index profiles that display them together with their overall Academic Behavioural Confidence measure on the reduced, 17-item scale and with values on the ABC factors that Sander’s PCA factor analysis produced. I will reflect on these fresh data charts in due course and determine the extent to which they can contribute to understanding the data and analysis, or whether these, too, are for a later project.

Nonetheless this is important to mention because the first visualization profile was spiky, and it seemed likely that an easier-to-understand profile might emerge were the dyslexia dimensions grouped by apparent similarity rather than displayed in the order in which they were presented in the questionnaire. It is of note that this is effectively a ‘by eye’ factor analysis, and to enable this the dimensions were sifted into newly-designated categories thus:

Dimensions that are tentatively sifted into each factor are shown in the graphic below, which is an example of one dyslexic respondent’s profile: respondent #ND-18801333, who presented a Dyslexia Index of Dx = 714 and hence is in the research subgroup DNI, students with no declaration of dyslexia but who present a dyslexia-like profile, which is the research subgroup I am particularly interested in. For this initial inspection, my reduced 16-item Dx scale has been used. As in previous displays of a similar nature, each scale-item location point on the chart indicates the extent of acquiescence with the dimension scale-item statement. In the representation below this indicates, for example, that the respondent strongly agreed with the scale-item statement: ‘At school I considered myself slower at learning to read than my peers’ but disagreed strongly with the statement: ‘At school I often mixed up similar letters in my writing’.

 

[Figure: Dx dimensions profile for respondent #ND-18801333, dimensions grouped ‘by eye’, 16-item Dx scale]

 

The set of profiles built so far is available on the project webpages here. Each page displays the respondent’s Dx dimensions profile visualization (as above) together with their ABC-17, reduced item scale, overall value and the ABC-17 factor values. The header indicates the respondent ID prefixed by either ‘DI’ or ‘ND’, indicating whether their dataset is from the base group of dyslexic students or the base group of non-dyslexic students respectively.

‘By eye’ factor reduction is useful and may be sufficient in the context of my research but I will only know this if I work through the more conventional process of factor analysis.

 

Using SPSS:

Laerd Statistics has provided invaluable guidance for executing the process in SPSS for extracting factors.

Although there remains work to do in understanding more clearly the statistical process that extracts the factors and so reduces the dimensions of the Dx Profiler, I have enough of a grip on it to begin interpreting the results from several trials.

Several SPSS outputs have been generated, each one exploring the results of the dimension reduction that occurs when the calculation criteria are adjusted. In summary, using all 20 dimensions from the main research questionnaire and forcing SPSS to extract 5 components has enabled an alternative to the ‘factors by eye’ graphic to be generated (presented below), which has required alternative factor labels to be assigned. This visualization represents the same respondent (as above) who, on the 20-point scale, presented a Dyslexia Index of Dx = 683.

(This compares with Dx = 714 on the 16-point scale. This is also interesting and prompted a quick analysis in the main Excel data spreadsheet to explore the differences between respondents’ Dx values on the 16-point and 20-point scales. What has emerged is that overall, the 16-point Dyslexia Index scale has the effect of lowering the Dyslexia Index of respondents at the lower end of the Dyslexia Index range and elevating Dx at the higher end. This appears to be consistent with other interpretations so far (eg using Cronbach’s alpha) that by removing the four scale items 3.03, 3.05, 3.07 and 3.13, a more effective discriminator may be established because when left in the datasets, these four dimensions appear to be diluting the Dyslexia Index – that is, they are effectively reducing the variance between values. This is evidenced by the standard deviations of the respective sets of values: for the 20-point Dx scale in research group DI, SD = 149.3 and the corresponding value for the 16-point scale is SD = 35.58. However, for research group ND the situation is reversed, with the corresponding standard deviations being: 20-point Dx scale, SD = 159.6; 16-point Dx scale, SD = 185.7. This is puzzling. Does it mean that for students with previously identified dyslexia, the 16-point Dyslexia Index provides a more accurate determination of ‘level of dyslexia’ whereas for everyone else, the 20-point scale is keener? In due course I will refer back to the analysis of Cronbach’s Alpha for the research subgroups to try to interpret this apparent contradiction more clearly.)

In the graphic below, which represents the same respondent’s Dx scale-item values now regrouped accordingly, the dimensions that constitute each factor are displayed in order of factor loading in a clockwise direction around the diagram. For example, in ‘Reading, Writing, Spelling’, the dimension that presented the highest factor loading is ‘I get anxious when asked to read aloud’ and the dimension presenting the lowest factor loading is ‘weak spelling’. This is interesting and might be suggesting that, to some degree at least, this dyslexic student at university may have developed spelling competencies that remediate previous weaknesses, so that spelling has become less troublesome – perhaps through use of spelling aids or assistive software applications – and that other reading and writing issues are now more significant.

Given the dimensional re-assignment that emerged out of this factor analysis I have also re-labelled the factors:

 

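[Figure: the same respondent’s Dx dimensions profile on the 20-item scale, regrouped and re-labelled according to the PCA factors]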

In interpreting the factor analysis outputs from SPSS I am learning that the table of communalities is the first output that is useful. A communality is the proportion of a variable’s variance that is accounted for by the principal component analysis. So for example, for the first dimension in the table below, 3.20: I get really anxious if I’m asked to read ‘out loud’, the communality value of 0.573 indicates that 57.3% of the variance in this dimension can be explained by the factors (components). A loading is the correlation between the variable and a factor, and these are the figures presented in line with each dimension. According to the guidance provided in Laerd Statistics, the research convention is to pay serious attention to factor loadings of > 0.32 (Dewberry, 2004, p309), with Dewberry adding a reference to earlier work by Comrey & Lee (1992) which proposes that a loading of > 0.71 is ‘excellent’. Note that in the table below, only factor loadings > 0.3 are presented, as SPSS conveniently suppresses all the others when presenting its output for the analysis, which is why the row of data for dimension 3.20 only shows the one value of 0.829. There are loadings onto all the other factors but they were small. The communalities extraction figure of 0.573 is thus the proportion of this dimension’s variance that can be accounted for by all of the factors together. These communalities need to be reported alongside the Rotated Component Matrix, which is a table that groups the 20 dimensions into the 5 components/factors where, in each component, dimensions are listed in descending order of their loading onto that factor. The table indicates ‘rotated’ components; rotation is the mathematical process that places the factors in the best (geometrical) position to enable easier interpretation. SPSS uses a process it calls ‘varimax’ rotation, which I’ve learned is an ‘orthogonal’ rotation process: the factors are forced to be independent of each other (rather than taking into account correlations between them).
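To make the relationship between loadings and communalities concrete, here is a minimal sketch of an unrotated PCA on a correlation matrix. It assumes NumPy is available, it uses randomly generated data rather than project data, and the function name `pca_loadings` is my own. It does not reproduce the SPSS procedure and it omits the varimax step, although an orthogonal rotation leaves each variable’s communality unchanged.

```python
import numpy as np


def pca_loadings(data, n_components):
    """Unrotated component loadings from the correlation matrix of `data`.

    `data` has one row per respondent and one column per scale item.
    A loading is the correlation between an item and a component.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest
    return eigvecs[:, order] * np.sqrt(eigvals[order])


# Synthetic data (NOT project data): 4 items driven by 2 latent traits.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=(200, 4))
data = np.column_stack(
    [latent[:, 0], latent[:, 0], latent[:, 1], latent[:, 1]]
) + noise

loadings = pca_loadings(data, n_components=2)

# Communality of an item = sum of its squared loadings on the retained
# components, i.e. the share of its variance that those components explain.
communalities = (loadings ** 2).sum(axis=1)
print(np.round(loadings, 3))
print(np.round(communalities, 3))
```

If every component is retained, each communality equals 1; retaining fewer components is what pushes the extraction values below 1.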

What emerges from this matrix (below) is that the factor structure isn’t quite as ‘simple’ as I would have wished because some dimensions load onto more than one factor (given the convention of loading > 0.32 indicating an influence that should be taken seriously). However, where this occurs, I have assigned the troublesome dimension to the factor onto which its loading is greatest – that is, where there is the greatest correlation between the dimension and the factor. I will reflect on this but texts I have consulted about the process of Factor Analysis more generally tend to agree that more often than not, a single, simple factor structure is elusive and it remains the task of the researcher to establish the most appropriate interpretation of the analysis that makes sense in the context of the project.

Rotated Component Matrix for Dyslexia Index, 20-point scale
Columns: item # / item statement / loadings on Factors 1 to 5 (only values > 0.3 shown) / Communalities (Extraction)
Factor labels: 1 = reading, writing, spelling; 2 = thinking & processing; 3 = organization & time-management; 4 = verbalizing & scoping; 5 = working memory
3.20 I get really anxious if I’m asked to read ‘out loud’ 0.829 0.573
3.08 When I’m reading, I sometimes read the same line again or miss out a line altogether 0.809 0.506
3.01 When I was learning to read at school, I often felt I was slower than others in my class 0.723 0.699
3.06 In my writing I frequently use the wrong word for my intended meaning 0.634  0.436 0.550
3.09 I have difficulty putting my writing ideas into a sensible order 0.609  0.337 0.321 0.639
3.02 My spelling is generally very good (reverse-coded data) 0.561  0.315 0.641
3.15 My friends say I often think in unusual or creative ways to solve problems 0.676 0.596
3.17 I get my ‘lefts’ and ‘rights’ easily mixed up 0.671 0.399 0.697
3.18 My tutors often tell me that my essays or assignments are confusing to read  0.427 0.663 0.685
3.11 When I’m planning my work I use diagrams or mindmaps rather than lists or bullet points 0.543 0.561
3.10 In my writing at school, I often mixed up similar letters like ‘b’ and ‘d’ or ‘p’ and ‘q’  0.432 0.521 0.553
3.19 I get in a muddle when I’m searching for learning resources or information 0.479 0.508 0.335 0.673
3.16 I find it really challenging to make sense of a list of instructions 0.369 0.464 0.406 0.686
3.05 I think I am a highly organized learner -0.789 0.568
3.03 I find it very challenging to manage my time efficiently 0.786 0.519
3.07 I generally remember appointments and arrive on time -0.602 0.351 0.654
3.14 I prefer looking at the ‘big picture’ rather than focusing on the details 0.820 0.623
3.04 I can explain things to people much more easily verbally than in my writing  0.353 0.617 0.613
3.13 I find following directions to get to places quite straightforward -0.764 0.710
3.12 I’m hopeless at remembering things like telephone numbers  0.398 0.530 0.573
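The assignment rule described above (place a cross-loading dimension with the factor on which its loading is greatest, while noting every loading above the 0.32 convention) can be sketched as follows. The loadings in the example are hypothetical, because the column positions of the cross-loadings are not recoverable from the table as reproduced here, and the function name is my own.

```python
SALIENT = 0.32  # conventional threshold for a loading to be taken seriously


def assign_factors(loadings, threshold=SALIENT):
    """Assign each item to the factor with its largest absolute loading.

    `loadings` maps an item label to its list of loadings, one per factor.
    Factors are numbered from 1, as in the SPSS output.
    """
    result = {}
    for item, row in loadings.items():
        primary = max(range(len(row)), key=lambda j: abs(row[j]))
        salient = [j + 1 for j, value in enumerate(row) if abs(value) > threshold]
        result[item] = {
            "factor": primary + 1,
            "salient_on": salient,
            "cross_loading": len(salient) > 1,
        }
    return result


# Hypothetical loadings on three factors (NOT taken from the table above).
example = {
    "item_a": [0.81, 0.10, 0.05],
    "item_b": [0.45, 0.62, 0.12],
    "item_c": [0.08, -0.77, 0.20],
}

assigned = assign_factors(example)
for item, info in assigned.items():
    print(item, info)
```

Absolute values are used so that a strongly negative loading, such as that of a reverse-worded item, still counts as a strong association with the factor.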

This factor analysis seems reasonable and so, for the moment at least, I am going to stick with it, which also means that I will revert to the full 20-item scale for Dyslexia Index. However, at a later point I will explore a PCA through SPSS for the reduced, 16-item scale to determine the impact that this has on identifying factors and the further implications of this when connecting the results to Academic Behavioural Confidence.

 

Factor analysis of Academic Behavioural Confidence

Sander & Sanders’ (2009) paper indicates their revision of the original, 24-item Academic Behavioural Confidence Scale into a reduced, 17-item scale based on principal component analysis executed on the collected data of their previous studies. This produced a combined datapool (n=865) of undergraduate responses to the ABC scale. It was suggested that the original, Academic Confidence Scale consisted of 6 factors: Grades, Studying, Verbalising, Attendance, Understanding and Requesting. The revised scale was also renamed to include ‘Behavioural’ to acknowledge the focus on confidence in actions and plans related to academic study (Sander & Sanders, 2006). From the data I have collected in my project, I have explored the outputs that are generated from both scales, the original 24-item, and the later 17-item metrics. The table below presents these outputs for comparison:


Research group | Research subgroup | n | ABC24 mean | ABC24 sd | ABC17 mean | ABC17 sd
DI | DI-600 (Dx > 592.5) | 47 | 57.89 | 15.24 | 57.49 | 15.75
Hedges’ g effect size / Student’s t-test p-value (DI-600 compared with DNI): ABC24: g = 0.483 / p = 0.041; ABC17: g = 0.521 / p = 0.032
ND | DNI (Dx > 592.5) | 18 | 64.92 | 12.43 | 65.24 | 12.26
ND | ND-400 (Dx < 400) | 44 | 72.15 | 12.35 | 72.25 | 12.66
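For anyone wanting to check the effect sizes in the table from the summary statistics alone, the sketch below computes a standardized mean difference using a weighted, pooled standard deviation, together with the corresponding independent-samples t statistic. The assumption that this is the formulation behind the reported ‘g’ values is mine, as the table does not state the formula, but it reproduces them closely. The optional small-sample correction factor is also my addition, and the function names are my own.

```python
from math import sqrt


def pooled_sd(n1, sd1, n2, sd2):
    """Standard deviation pooled across two samples, weighted by df."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def effect_size(n1, m1, sd1, n2, m2, sd2, small_sample_correction=False):
    """Mean difference in pooled-SD units (optionally bias-corrected)."""
    g = (m1 - m2) / pooled_sd(n1, sd1, n2, sd2)
    if small_sample_correction:
        g *= 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return g


def t_statistic(n1, m1, sd1, n2, m2, sd2):
    """Student's t for independent samples, equal variances assumed."""
    return (m1 - m2) / (pooled_sd(n1, sd1, n2, sd2) * sqrt(1 / n1 + 1 / n2))


# Summary statistics from the table: DNI (n = 18) against DI-600 (n = 47).
g_abc24 = effect_size(18, 64.92, 12.43, 47, 57.89, 15.24)
g_abc17 = effect_size(18, 65.24, 12.26, 47, 57.49, 15.75)
t_abc24 = t_statistic(18, 64.92, 12.43, 47, 57.89, 15.24)

print(round(g_abc24, 3), round(g_abc17, 3), round(t_abc24, 3))
```

Placing DNI first makes a positive value mean that ABC is higher for subgroup DNI than for subgroup DI-600.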

This reveals little difference between the mean ABC24 and mean ABC17 values for any of the research subgroups, although a slightly greater effect size between the sample means for research subgroups DI-600 and DNI is generated using the 17-point ABC Scale. In both cases (ABC24 and ABC17) Student’s t-test reveals that a significant difference (p < 0.05) is present between the sample means (one-tail test, 5% level). It hasn’t escaped my attention that it might be useful to determine whether this difference in effect sizes is significant or not, and hence whether to stick with the 24-point scale or use the revised 17-point one for my enquiry. I am still to figure out how to do this, especially since the distribution of effect sizes is unknown, and so it may be neither appropriate nor statistically robust to try to establish whether a significant difference between the values exists – more on this later. However, in the meantime, I do know how to calculate a confidence interval for the population Cohen’s ‘d’ effect size, ‘δ’, from which I may be able to establish whether these two effect sizes are in fact (statistically at least) the same. Cohen’s ‘d’ produces a slightly different effect size from Hedges’ ‘g’, with the latter being more appropriate when the sample sizes are substantially different, as mine are. The Confidence Interval calculation process that I can access (Cumming, 2012) only calculates the CI for ‘d’. For the data provided above, the confidence interval when using ABC24 emerged as -0.068 < δ < 1.032, and for ABC17, -0.032 < δ < 1.070, which seems to be suggesting that, to all intents and purposes, the difference in effect sizes when using ABC24 compared with using ABC17 is marginal.
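As a rough cross-check on those intervals, a large-sample normal approximation to the standard error of ‘d’ can be used. This is a swapped-in technique and not the process described in Cumming (2012), so the bounds differ slightly in the third decimal place; the function name and the 95% level are my assumptions.

```python
from math import sqrt
from statistics import NormalDist


def approx_ci_for_d(d, n1, n2, level=0.95):
    """Approximate confidence interval for a standardized mean difference.

    Uses the large-sample variance (n1 + n2)/(n1 * n2) + d^2 / (2(n1 + n2))
    and normal quantiles. This is an approximation only.
    """
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# Effect sizes reported above, for DNI (n = 18) and DI-600 (n = 47).
ci_abc24 = approx_ci_for_d(0.483, 18, 47)
ci_abc17 = approx_ci_for_d(0.521, 18, 47)

print([round(x, 3) for x in ci_abc24])
print([round(x, 3) for x in ci_abc17])

# The two intervals overlap almost entirely.
overlap = min(ci_abc24[1], ci_abc17[1]) - max(ci_abc24[0], ci_abc17[0])
print(round(overlap, 3))
```

Such a large overlap is consistent with reading the difference between the ABC24 and ABC17 effect sizes as marginal, though it is not a formal test of that difference.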

In any case, even though Sander & Sanders claim that the criterion validity of the ABC Scale is enhanced through their factor analysis procedure and the subsequent reduction into a 17-point scale, I felt that a similar PCA should be conducted on my own datapool – which is of a reasonable size (n=166) – to explore the factor structure that emerges and which might be more relevant to my own analysis.

The outcome established through several work-ups in SPSS is a reasonable, 5-factor structure that makes good sense in the context of my enquiry:

Rotated Component Matrix for Academic Behavioural Confidence 24-point scale
Columns: ABC item # / item statement / loadings on Factors 1 to 5 / Communalities (Extraction)
Factor labels: 1 = study efficacy; 2 = engagement; 3 = academic output; 4 = attendance; 5 = debating
121  – plan appropriate revision schedules 0.809  0.761
101  – study effectively in independent study 0.703  0.637
104  – manage workload to meet deadlines 0.695  0.593
113  – prepare thoroughly for tutorials 0.665  0.578
122  – remain adequately motivated throughout my time at university 0.639  0.555
 119  – make the most of university study opportunities 0.637 0.570
114  – read recommended background material 0.602 0.318 0.530
103  – respond to lecturers’ questions in a full lecture theatre 0.799  0.662
110  – ask lecturers questions during a lecture 0.774  0.707
112  – follow themes and debates in lectures 0.654  0.610
105  – present to a small group of peers 0.624  0.483
102  – produce your best work in exams 0.605 0.444  0.692
111  – understand material discussed with lecturers 0.597  0.516
117  – ask for help if you don’t understand 0.454  0.406
116  – write in an appropriate style 0.819  0.736
115  – produce coursework at the required standard 0.814  0.805
107  – attain good grades 0.383 0.740  0.740
120  – pass assessments at the first attempt 0.696  0.593
123  – produce best work in coursework assignments 0.492 0.511 0.344  0.649
106  – attend most taught sessions 0.812  0.739
124  – attend tutorials 0.772  0.675
118  – be on time for lectures 0.676  0.522
108  – debate academically with peers 0.435 0.640  0.652
109  – ask lecturers questions in one-one settings 0.321 0.346 0.632  0.624

 

Although this factor analysis is valuable, it nevertheless reveals that the dimensional structure for ABC24 for my data is also not as ‘simple’ as I would have wished with some dimensions loading onto more than one factor – just as the PCA for Dyslexia Index also revealed (above). However, grouping dimensions into factors 1 – 5 as shown above makes sense in the context of this enquiry and leads me to establish the 5 factors as:

It is useful to compare my factor analysis with that of Sander & Sanders (2009), reproduced below, where I have also compared the grouping of dimensions that emerged from their PCA to my own, indicated by what I have termed the ‘closest map’. This is where dimensions from both the S&S PCA and my own PCA result in similar dimensional groupings. I have, of course, had to revert to Sander & Sanders’ original 24-item scale to make this comparison.

Rotated Component Matrix for Academic Behavioural Confidence 24-point scale
(adapted from: Sander & Sanders, 2009, p25)
Columns: ABC item # / item statement / loadings on Factors 1 to 6
Sander & Sanders’ factor designations: 1 = studying; 2 = verbalising; 3 = grades; 4 = attendance; 5 = understanding; 6 = requesting
Closest map to ABC24(5) in my data: 1 = study efficacy; 2 = engagement; 3 = academic output; 4 = attendance; 5 = no mapping; 6 = no mapping
121  – plan appropriate revision schedules 0.80
101  – study effectively in independent study  0.72
122  – remain adequately motivated throughout my time at university  0.62
104  – manage workload to meet deadlines  0.56
103  – respond to lecturers’ questions in a full lecture theatre 0.85
105  – present to a small group of peers  0.81
108  – debate academically with peers  0.67
110  – ask lecturers questions during a lecture  0.58
120  – pass assessment at the first attempt 0.83
115  – produce coursework at the required standard  0.74
116  – write in an appropriate style  0.67
107  – attain good grades  0.66
123  – produce best work in coursework assignments  0.55
102  – produce best work in exams  0.51
124  – attend tutorials 0.86
106  – attend most taught sessions  0.82
118  – be on time for lectures  0.40
119  – make the most of university study opportunities  0.17 0.24 0.29 0.21
113  – prepare thoroughly for tutorials  0.73
112  – follow themes and debates in lectures  0.72
111  – understand material discussed with lecturers  0.68
114  – read recommended background material  0.68
109  – ask lecturers questions in one-one settings 0.85
117  – ask for help if you don’t understand  0.83

It is of note that scale item 119 was not attributed to any of Sander & Sanders’ factors, its highest loading being just 0.29, on the factor ‘attendance’.

An attempt has been made to map these factors to those established in my own data. This has been possible for the first four factors, whereas Sander & Sanders’ (S&S) factors 5 and 6 show no obvious mapping to my factors. A deeper discussion about the similarities and differences between these two factor analyses will be presented in the final thesis but for now, a cursory inspection of the two tables side-by-side shows that:

The extraction communalities were not published in Sander & Sanders’ 2009 paper, from which these data have been drawn.

 

Interlinking Dyslexia Index and Academic Behavioural Confidence

What has emerged from the PCA on my data so far is that the structure of my metric, Dyslexia Index, broadly loads onto 5 factors which I have designated as:

  1. Reading, Writing, Spelling
  2. Thinking and Processing
  3. Organization and Time-management
  4. Verbalizing and Scoping
  5. Working Memory

and that the PCA applied to data collected in my enquiry on Sander & Sanders’ full, 24-item Academic Behavioural Confidence Scale has also loaded onto 5 factors which I have designated as:

  1. Study efficacy
  2. Engagement
  3. Academic output
  4. Attendance
  5. Debating

Exploring the interrelationships between these two sets of factors has enabled a 25-cell (5 x 5) matrix to be prepared (below) which sets out Hedges’ ‘g’ effect sizes and Student’s t-test p-values between the research subgroups when these are established according to each of the Dyslexia Index factors. So for example, looking at the row of data for Dyslexia Index Factor 3: Organization and Time Management, when research subgroup DNI (from research group ND) and research subgroup DI-600 (from research group DI) are recreated using Dx Factor 3 as the sole sifting criterion, a fresh group of datasets now forms each of these research subgroups. The mean average for ABC Factor 1: Study Efficacy for the respondents in these recreated subgroups shows an effect size of 0.42, supported by a significant difference between the ABC Factor 1 sample means (p=0.0299).

[Hedges’ ‘g’ has been used because this calculation for effect size uses a weighted mean process for pooling the standard deviations of each dataset being considered. In the t-test, a one-tail test has been applied because in almost all cases, the mean ABC24(5) values for research subgroup DNI exceeded those for the (control) research subgroup DI-600.]

[Figure: matrix of Hedges’ ‘g’ effect sizes and t-test p-values for each ABC factor, by Dyslexia Index factor]

It is critical to understand that the Dyslexia Index Factor 3 research subgroups ND-400, DNI and DI-600 may contain different datasets from, for example, the corresponding Dyslexia Index Factor 2 research subgroups. But this enables a deeper insight into differences in academic confidence (at a factor level) between students with reported dyslexia and those with unreported dyslexia-like profiles, as established according to one Dyslexia Index factor or another. This is quite a complex analysis process in terms of establishing which datasets appear in which subgroups, but the results are illuminating, not least because they appear to indicate that there may be merit in exploring the enquiry datapool on a factor-by-factor basis. As such, there is more work to do here and this will be reported extensively in the final thesis write-up. The table below summarizes the relative sample sizes of the research subgroups when these are sifted according to Dyslexia Index factor. Also shown are the sample mean Dx values for each factor, together with t-test evidence that these can be considered as not significantly different between the two research subgroups of interest (DNI and DI-600). Hence it is appropriate to consider ABC effect sizes between research subgroups on a Dx factor-by-factor basis.

[Table: research subgroup sample sizes and sample mean Dx values when sifted by Dyslexia Index factor]

Before any further discussion of these data, it should be noted that the boundary value for Dx that assigns data to the research subgroups has been amended. Whereas the designations ‘DNI’ and ‘DI-600’ will remain unchanged in order to avoid confusion with earlier posts, the boundary value has been adjusted to Dx = 592.5. This followed from a simple, independent-means t-test analysis of Dx data in each of the two subgroups, DNI and DI-600, which revealed that with the boundary value set at Dx = 600, the difference between the means was significant at the 5% level (p = 0.048). The boundary value serves principally a) to establish students with unidentified dyslexia-like profiles from research group ND, and hence designate these as research subgroup DNI, and b) to establish a like-for-like control (sub)group (DI-600) of students with identified dyslexia. It was felt, therefore, that a boundary value of Dx = 600 was not as robust as it could be, given that the mean Dx values of these two subgroups exhibited a statistically significant difference. Through iterations of the t-test, a revised boundary value of Dx = 592.5 presented a p-value that was not significant at the 5% level (p = 0.053), hence indicating that for analysis purposes, the mean Dx values of the two most important subgroups, DNI and DI-600, were not significantly different. This small adjustment to the sifting boundary slightly amended the datasets included in the base research subgroups (established using the overall Dx value rather than any factor-based variation of this): one additional dataset is included in subgroup DNI, bringing it up to size n = 18, and two additional datasets increase the size of research subgroup DI-600 to n = 47.
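The sifting rule itself is simple enough to sketch. The version below assumes that the research group can be read from the ‘DI’ or ‘ND’ prefix of the respondent ID and that respondents falling between the two boundaries belong to none of the three subgroups of interest; the function name and the `None` return value are my own conventions.

```python
DX_UPPER = 592.5  # revised boundary for a dyslexia-like profile
DX_LOWER = 400    # boundary for the strongly non-dyslexic subgroup


def research_subgroup(respondent_id, dx):
    """Sift one respondent into DI-600, DNI or ND-400, or return None."""
    group = respondent_id.lstrip("#")[:2]  # 'DI' or 'ND' prefix
    if group == "DI" and dx > DX_UPPER:
        return "DI-600"
    if group == "ND" and dx > DX_UPPER:
        return "DNI"
    if group == "ND" and dx < DX_LOWER:
        return "ND-400"
    return None  # not in any of the three subgroups of interest


# The respondent discussed earlier: Dx = 714 (16-point), Dx = 683 (20-point).
print(research_subgroup("#ND-18801333", 714))
print(research_subgroup("#ND-18801333", 683))
```

Keeping the boundaries as named constants makes it easy to re-run the sift with Dx = 600 and compare subgroup membership before and after the adjustment.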

 

Effect size between Academic Behavioural Confidence:

This matrix of effect sizes and t-test p-values is the first analysis of the ABC data on a Dyslexia Index factor-by-factor basis.

The key overall finding is that the analysis identifies a medium effect size of 0.48 between Academic Behavioural Confidence for the key research subgroups, DNI and DI-600 (shown in the grid-sector at the extreme bottom-right). Taken together with the t-test for a significant difference between independent sample means, which returns a p-value of p=0.043 (t=1.743, one-tail test), these outcomes appear to be signalling a significant difference between the ABC sample means for these research subgroups.

By looking in more detail at the matrix of effect size and p-value results for the component analysis for both metrics (Dx and ABC), it may be possible to identify where the contributing differences between Dx and ABC for each of the subgroups lie.

For example, organizing respondents according to Dyslexia Index Factor 3: Organization and Time Management is the only Dx factor ‘sift’ that presents notable effect size differences between the research subgroups DNI and DI-600 in all five factors of Academic Behavioural Confidence. Effect size ‘g’ values range from g = 0.38 in ABC factor 5: Debating, with the t-test indicating an albeit only just significant difference between the sample means (p = 0.046), to g = 0.89 in ABC factor 2: Engagement, where the t-test returned a very highly significant p-value of p = 0.0001 (rounded to 4 dp; the actual p-value is p = 0.0000569). Given that effect size differences are effectively ‘one tail’, that is, are set so that a positive effect size indicates that ABC is higher for the research subgroup DNI than for subgroup DI-600, these results seem to be indicating that students with reported dyslexia exhibit significantly lower levels of academic confidence when sifted according to their Organization & Time Management factor of Dyslexia Index. This might be suggesting that, on the basis of this dyslexia-indicating factor at least, aspects of dyslexia support related to ameliorating apparent weaknesses in organization and time management may be less effective than might be supposed. Not knowing that you may be dyslexic appears to be better for you when it comes down to the study-skill attribute of organization and time management.

It is also highly interesting to note that for this Dx Factor 3, the effect size differences between students regarded as highly NON-dyslexic (that is, research subgroup ND-400) and the dyslexic control group are all negative. I think this demonstrates that when considering a level of dyslexia as measured through the parameter Organization and Time Management, it is better to be a student with an unreported dyslexia-like profile than it is to be either a reported dyslexic or highly non-dyslexic. This is puzzling, but may indicate that, very curiously, some of the dimensions of dyslexia that constitute this factor are actually positive attributes in relation to academic confidence, but only in students with (potentially) unidentified dyslexia. Clearly these conclusions relate only to this datapool of respondents and it would be inappropriate to generalize more widely, especially as research subgroup DNI is quite small (n=18).

It must be emphasized again that the Dyslexia Index factor analysis process used here generates different cohorts of students in each research subgroup when regarding Dyslexia Index (Dx) as the independent variable (that is, the one I’ve fixed or chosen). This is because aggregating the values for the dimensions that together constitute one factor generates a different Dyslexia Index for any specific student respondent than aggregating the dimensions of any other factor. In other words, Student X will have a different Dx value for each Dx factor, each of which will differ from their overall (i.e. aggregated) Dyslexia Index. This may mean that a student who is included in one of the research subgroups of interest (ND-400, DNI or DI-600) on the basis of one factor is not included when a Dx value is generated through one of the other Dx factors, and vice versa. Perhaps I should build fresh diagrammatic visualizations to show the different Dx values that students present against each Dx factor.

This point is demonstrated here:

For example, consider respondent #96408048 from research group ND who presented an overall Dyslexia Index of Dx = 604.94, hence placing this respondent just above the boundary into research subgroup DNI, that is, students with an unreported dyslexia-like profile. The Dyslexia Index values for each of the 5 factors of Dyslexia Index for this respondent are these:

Student respondent: #96408048
Dx overall: 604.94
Dx Factor 1 (Reading, Writing, Spelling): 824.11
Dx Factor 2 (Thinking & Processing): 746.99
Dx Factor 3 (Organization & Time Management): 512.26
Dx Factor 4 (Verbalizing & Scoping): 80.00
Dx Factor 5 (Working Memory): 489.51

The factor analysis reveals that this respondent’s Dyslexia Index is greater than the subgroup boundary value of Dx = 592.5 for only two of the factors. What is interesting to note is that this respondent’s Dx values for those two factors are high, indicating that this particular individual is presenting a strongly dyslexic profile in these two areas (reading, writing, spelling, and thinking & processing), which have conventionally been regarded throughout decades of dyslexia research with children as key indicators of the syndrome. Reflecting on this has caused me to consider the ways in which the factor Dx values contribute to the overall Dx value and, additionally, how the factor profiles of the other respondents in research subgroup DNI (sifted according to the overall Dyslexia Index value of Dx > 592.5) compare to each other. However, in this blog-post I want to focus on the discussion about the matrix of effect sizes above, and so a discussion about the factor profiles of respondents in each of the previously established research subgroups DNI, DI-600 and ND-400 is presented in an alternative post and also more fully on project webpages here.
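The comparison just described can be sketched in a few lines of Python. The factor values and the boundary of Dx = 592.5 are taken from this post; the variable names are illustrative only, and the sketch makes no attempt to reproduce how the overall Dx value is aggregated from its dimensions.

```python
# Boundary above which a respondent falls into subgroup DNI (value from this post).
DNI_BOUNDARY = 592.5

# Dx values for respondent #96408048, as listed in this post.
overall_dx = 604.94
factor_dx = {
    "Factor 1: Reading, Writing, Spelling": 824.11,
    "Factor 2: Thinking & Processing": 746.99,
    "Factor 3: Organization & Time Management": 512.26,
    "Factor 4: Verbalizing & Scoping": 80.00,
    "Factor 5: Working Memory": 489.51,
}

# Factors on which this respondent would, taken alone, exceed the boundary.
factors_above_boundary = [
    name for name, dx in factor_dx.items() if dx > DNI_BOUNDARY
]

print("Overall Dx above boundary:", overall_dx > DNI_BOUNDARY)
for name in factors_above_boundary:
    print("Above boundary on", name)
```

Run against these values, the sift flags only Factor 1 and Factor 2, which matches the observation that this respondent crosses the boundary on just two of the five factors despite an overall Dx above 592.5.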

To be continued – gasp!

REFERENCES
Comrey, A.L., Lee, H.B., 1992, A first course in factor analysis, Hillsdale New Jersey, Erlbaum.
de la Fuente, J., Sander, P., Putwain, D., 2013, Relationship between undergraduate student confidence, approach to learning and academic performance: the role of gender, Revista de Psicodidactica, 18(2), 375-393.
de la Fuente, J., Justicia, F., Sander, P., Cardelle-Elawar, M., 2014, Personal self-regulation and regulatory teaching to predict performance and academic confidence: new evidence for the DEDEPRO Model, Electronic Journal of Research in Educational Psychology, 12(3), 597-620.
Dewberry, C., 2004, Statistical methods for organizational research, Abingdon, Routledge.
Hilale, D., Alexander, G., 2011, Academic Behavioral Confidence of first-entering humanities university access program students, Journal of Social Sciences, 26(3), 203-209.
Keinhuis, M., Chester, A., Wilson, P., Elgar, K., 2011, Implementing an interteaching model to increase student engagement in large classes in 2nd year psychology, Learning and Teaching Investment Fund 2010: Final Project Report, RMIT University, Melbourne.
Kline, P., 1986, A handbook of test construction: Introduction to psychometric design, London, Methuen.
Kline, P., 1994, An easy guide to factor analysis, Hove, Routledge.
Lynch, S., Webber, M., 2011, Evaluation of online interdisciplinary academic literacy learning materials, Excellence in Teaching Conference 2011 Annual Proceedings, King’s Learning Institute, London.
Sander, P., Sanders, L., 2006, Understanding academic confidence, Psychology Teaching Review, 12(1), 29-39.
Sander, P., Sanders, L., 2009, Measuring academic behavioural confidence: the ABC scale revisited, Studies in Higher Education, 34(1), 19-35.
Shaukat, S., Bashir, M., 2016, University students’ academic confidence: comparison between social sciences and natural sciences disciplines, Journal of Elementary Education, 25(2), 113-123.
Willis, W.A., 2010, Background characteristics and academic factors associated with the academic behavioural confidence of international graduate students in Ohio’s public institutions, PhD Dissertation, Kent State University College, Ohio.
