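Internal consistency
*Concept of internal consistency:
1. It is one index of reliability.
2. It is computed from the correlations among item scores and the correlations between each item score and the total score.
3. Its purpose is to confirm the homogeneity of all items in a scale (McCrae, Kurtz, Yamagata, & Terracciano, 2011; Tavakol & Dennick, 2011), that is, to check whether test scores obtained under the same conditions are affected by differences in content sampling (the selection of items for the scale) (Nunnally & Bernstein, 1994). Classical test theory assumes that a scale's items are a random sample drawn from all possible items in the domain being assessed (Streiner, 2003). Examining a scale's internal consistency therefore checks whether the items chosen during scale development are appropriate (whether items are redundant, whether they are representative, and whether they yield consistent measurement results).
*Value of internal consistency: it helps researchers check item sampling and item heterogeneity (McCrae et al., 2011; Nunnally & Bernstein, 1994).
*Conceptual clarifications about internal consistency:
1. Internal consistency vs. unidimensionality (Gardner, 1995)
Unidimensionality is a sufficient but not necessary condition for internal consistency: if a scale is unidimensional, its internal consistency will tend to be high, but a scale with high internal consistency is not necessarily unidimensional. In the accompanying figure, the left panel shows a scale whose items each assess a different concept; it is not unidimensional and its internal consistency is poor. The middle panel shows a scale with three dimensions that are highly correlated with one another, so the internal consistency of its scores is still high. The right panel shows a unidimensional scale, whose internal consistency is also high.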
2. Once α has been determined in one study, is the reliability of the scale known under all circumstances?
→No. A test is not reliable or unreliable in itself. Reliability is a property of the scores on a test for a particular population of examinees (Streiner, 2003; Wilkinson, 1999).
3. Does a high value of α imply a high degree of internal consistency?
→Not necessarily. α is also strongly affected by the length of the scale (Cortina, 1993; Streiner, 2003).
4. Is a higher value always better?
→No. Very high values may reflect unnecessary duplication of content across items and point more to redundancy than to homogeneity (Streiner, 2003).
5. Does α always range between 0 and 1?
→No. When some of the items are negatively correlated with others in the scale, α can be negative (Streiner, 2003).
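*Advantages and disadvantages of internal consistency:
1. Advantage: it can be examined from a single test administration, so the burden on participants and raters is low.
2. Disadvantages: it is susceptible to the measurement error of a single administration; it is unsuitable for speeded tests (α is overestimated); items that are too difficult or too easy give a high α; and it is sensitive to the number of items (more items give a higher α) (Osburn, 2000; Streiner, 2003).
*Factors affecting internal consistency: (McCrae et al., 2011)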
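*Research designs for internal consistency:
1. Split-half methods
˙Administer the scale to one group of people
˙Split the scale into two halves (e.g., first half vs. second half, or odd- vs. even-numbered items)
˙Compute the correlation between the total scores of the two halves
2. Internal consistency methods
˙Administer the scale to one group of people
˙Compute an internal consistency coefficient
3. The sample should include at least 50 people (Javali, Gudaganavar, & Raj, 2011)
˙Other studies: at least 200 (Peterson, 1994), 300 (Nunnally & Bernstein, 1994), or 400 people (Charter, 1999)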
*Assumptions for examining internal consistency:
1. The parts of the measure must be equivalent
2. Errors in measurement between parts are unrelated
3. An item or half test score is a sum of its true and its error scores
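*Split-half reliability: a scale can be split into halves in many ways, so the resulting reliability estimate is not unique
˙Pearson's r: because each half has only half the items, the value underestimates the reliability of the full scale
˙Spearman-Brown prophecy formula: corrects the split-half correlation coefficient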
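The split-half procedure and the Spearman-Brown correction can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package: the score matrix is made-up data (rows are respondents, columns are items), and the function names are hypothetical. The correction used is the standard two-halves form, r_full = 2 r_half / (1 + r_half).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step a half-test correlation up to the full test length."""
    return 2 * r_half / (1 + r_half)

def split_half(scores, items_a, items_b):
    """Return (half-test correlation, Spearman-Brown corrected value)."""
    half_a = [sum(row[i] for i in items_a) for row in scores]
    half_b = [sum(row[i] for i in items_b) for row in scores]
    r_half = pearson_r(half_a, half_b)
    return r_half, spearman_brown(r_half)

# Hypothetical data: 6 respondents answering 4 items on a 1-5 scale.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 4],
]

odd_even = split_half(scores, [0, 2], [1, 3])    # odd- vs. even-numbered items
first_last = split_half(scores, [0, 1], [2, 3])  # first half vs. second half
print("odd/even   :", odd_even)
print("first/last :", first_last)
```

The two splits give different estimates for the same data, which illustrates why the split-half estimate is not unique.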
*Internal consistency methods:
1. Inter-item correlation and item-total correlation: used as a reference when selecting items
2. Kuder-Richardson reliability: suitable for dichotomous items
˙KR20: gives the same value as Cronbach's α
˙KR21: assumes that every item has the same difficulty
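˙Compared with KR21, KR20 is the better reliability estimate (Traub, 1994)
3. Cronbach's α
˙Raw alpha: applicable to both dichotomous and polytomous items; its value equals the mean of all possible split-half reliability coefficients
˙Standardized alpha: SPSS reports both raw α and standardized α. If the scale is scored with raw scores, use raw α; if it is scored with standard scores, use standardized α.
˙Stratified alpha: suitable for testlets or for scales made up of several subtests. Because such scales violate the assumption that item errors are uncorrelated, Cronbach's α is overestimated. If a test has several subscales, however, Cronbach's α can simply be computed and reported for each subscale separately.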
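As a rough illustration of the coefficients listed above, the sketch below computes raw Cronbach's α, KR20, and KR21 for a small set of dichotomous (0/1) items. It uses the textbook formulas α = k/(k-1) · (1 - Σ item variances / total-score variance), KR20 = k/(k-1) · (1 - Σ p·q / σ²), and KR21 = k/(k-1) · (1 - M(k-M) / (k·σ²)). The data and function names are made up for the example.

```python
def variance(values, ddof=1):
    """Variance with n - ddof in the denominator (ddof=0: population)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

def cronbach_alpha(scores):
    """Raw alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def kr20(scores):
    """KR20 for 0/1 items: p is the proportion scoring 1 on an item."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = variance([sum(row) for row in scores], ddof=0)
    return k / (k - 1) * (1 - sum_pq / total_var)

def kr21(scores):
    """KR21: like KR20, but treats all items as equally difficult."""
    n, k = len(scores), len(scores[0])
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    total_var = variance(totals, ddof=0)
    return k / (k - 1) * (1 - mean_total * (k - mean_total) / (k * total_var))

# Hypothetical data: 6 respondents, 4 items scored 0 (wrong) or 1 (right).
binary_scores = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]

alpha = cronbach_alpha(binary_scores)
kr20_value = kr20(binary_scores)
kr21_value = kr21(binary_scores)
print("alpha:", alpha, "KR20:", kr20_value, "KR21:", kr21_value)
```

For 0/1 items, α and KR20 coincide, while KR21 is lower here because the items differ in difficulty.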
*SPSS steps for internal consistency:
1. Choose Analyze → Scale → Reliability Analysis
2. In the dialog box, move the items to be analyzed into the Items list
3. Select the reliability model you need: Alpha or Split-half
4. Click Statistics → under "Descriptives for", check Item, Scale, and "Scale if item deleted" → under "Inter-Item", check Correlations → click Continue
5. Click OK
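For readers without SPSS, two of the item-level statistics requested in step 4 can be approximated by hand: the corrected item-total correlation (each item correlated with the sum of the remaining items) and α if the item is deleted. The sketch below is an illustration under assumptions: the data are made up, the function names are hypothetical, and the output is not guaranteed to match SPSS digit for digit.

```python
from math import sqrt

def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cronbach_alpha(scores):
    """Raw alpha from item variances and total-score variance."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def corrected_item_total(scores):
    """Correlate each item with the total of the remaining items."""
    k = len(scores[0])
    result = []
    for i in range(k):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]
        result.append(pearson_r(item, rest))
    return result

def alpha_if_item_deleted(scores):
    """Recompute alpha once for each item left out."""
    k = len(scores[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in scores])
        for i in range(k)
    ]

# Hypothetical data: 6 respondents answering 4 items on a 1-5 scale.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 4, 3, 4],
]

item_total = corrected_item_total(scores)
alpha_deleted = alpha_if_item_deleted(scores)
for i, (r, a) in enumerate(zip(item_total, alpha_deleted), start=1):
    print(f"item {i}: item-total r = {r:.3f}, alpha if deleted = {a:.3f}")
```

An item whose removal raises α, or whose item-total correlation is low, is a candidate for review during item selection.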
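*Criteria for interpreting internal consistency values:
1. Inter-item correlation: 0.3-0.9; item-total correlation: 0.2-0.9 (Fitzpatrick, Davey, Buxton, & Jones, 1998; Hobart et al., 2001)
2. Split-half reliability and Cronbach's α:
˙Developing a measurement tool: Excellent ≥ 0.80; Adequate 0.70-0.79; Poor < 0.70
˙Clinical or educational decision-making: ≥ 0.9 (Andresen, 2000; DeVellis, 1991; Nunnally, 1978)
*Numerical properties of internal consistency:
1. Values should fall between 0 and 1
2. A value > 0.9 suggests redundant items (Streiner, 2003; Fitzpatrick, Davey, Buxton, & Jones, 1998; Tavakol & Dennick, 2011)
3. With more than 20 items, reliability is usually acceptable (α > 0.70) even when inter-item correlations are small (r = 0.30) (Cortina, 1993)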
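The effect of scale length in point 3 can be illustrated with the standardized-alpha formula, α = k·r̄ / (1 + (k - 1)·r̄), where k is the number of items and r̄ the mean inter-item correlation. Using the standardized form here is a simplifying assumption (it treats all items as having equal variance); the function name is hypothetical.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha from item count k and mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Hold the mean inter-item correlation at 0.30 and vary only the length.
for k in (5, 10, 20, 40):
    print(k, round(standardized_alpha(k, 0.30), 3))
# 5 0.682
# 10 0.811
# 20 0.896
# 40 0.945
```

With the same modest inter-item correlation, α moves from below 0.70 to above 0.90 purely because items are added, which is why a high α alone does not show that the items are homogeneous.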
III. Main references
1. Andresen, E. M. (2000). Criteria for assessing the tools of disability outcomes research. Archives of Physical Medicine and Rehabilitation, 81, S15-S20. 
2. Charter, R. A. (1999). Sample Size Requirements for Precise Estimates of Reliability, Generalizability, and Validity Coefficients. Journal of Clinical and Experimental Neuropsychology, 21, 559-566.
3. Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.
4. DeVellis, R.F. (1991). Scale Development Theory and Applications. London: SAGE.
5. Fitzpatrick, R., Davey, C., Buxton, M. J., & Jones, D. R. (1998). Evaluating patient-based outcome measures for use in clinical trials. Health Technology Assessment, 2, i-iv, 1-74. 
6. Gardner, P. (1995). Measuring attitudes to science: Unidimensionality and internal consistency revisited. Research in Science Education, 25, 283-289. 
7. Hobart, J. C., Lamping, D. L., Freeman, J. A., Langdon, D. W., McLellan, D. L., Greenwood, R. J., & Thompson, A. J. (2001). Evidence-based measurement: which disability scale for neurologic rehabilitation? Neurology, 57, 639-644. 
8. Javali, S. B., Gudaganavar, N. V., & Raj, S. M. (2011). Effect of Varying Sample Size in Estimation of Coefficients of Internal Consistency. WebmedCentral Biostatistics, 2:WMC001649.
9. McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal Consistency, Retest Reliability, and Their Implications for Personality Scale Validity. Personality and Social Psychology Review, 15, 28-50.
10. Nunnally, J. C. (1978). Psychometric Theory. New York: McGraw-Hill.
11. Nunnally, J. C., & Bernstein, I. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill.
12. Osburn, H. G. (2000). Coefficient alpha and related internal consistency reliability coefficients. Psychological Methods, 5, 343-355.
13. Peterson, R. A. (1994). A meta-analysis of Cronbach's coefficient alpha. Journal of Consumer Research, 21, 381-391.
14. Streiner, D. L. (2003). Starting at the beginning: an introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99-103.
15. Tavakol, M., & Dennick, R. (2011). Making sense of Cronbach's alpha. International Journal of Medical Education, 2, 53-55.
16. Traub, R. E. (1994). Reliability for the Social Sciences: Theory and Applications. Thousand Oaks, CA: Sage.
17. Wilkinson, L. (1999). Statistical Methods in Psychology Journals: Guidelines and Explanations. American Psychologist, 54, 594-604.
 
*References for the statistical formulas:
1. Spearman-Brown prophecy formula
˙Spearman, C. (1910). Correlation calculated from faulty data. British Journal of Psychology, 3, 271–295.
˙Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal of Psychology, 3, 296–322.
2. KR20 & KR21
˙Kuder, G. F., & Richardson, M. W. (1937). The theory of the estimation of test reliability. Psychometrika, 2, 151-160.
3. Cronbach's α
˙Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
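1. An α that is too high may indicate redundant items (scales with more than 14 or 20 items are prone to this). Researchers have two remedies: delete items, and check whether the items are representative (that is, whether an important concept has been left unassessed, so that the remaining items overlap too much).
2. Items that are too difficult or too easy relative to the sample's ability do not cause α to be overestimated. Rather, the scale's internal consistency is inherently high when it is applied to that population.
→I do not yet fully understand or agree with this point; it needs further thought and discussion.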







 