Contents
Type I Error
T-tests and Type I error
Confidence intervals and Type I error
Ways to reduce Type I error
Meaning: concluding that the treatment group and the control group differ significantly when in fact they do not. Also called a "false positive".
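import statsmodels.stats.api as sms
from termcolor import cprint

# df is assumed to be an already-loaded pandas DataFrame with a 'Group'
# column (values 0, 1, 2) and the 14 variables in columns 1..14.
def multi_ttests(x):
    # Split variable x by group.
    x0 = df[df['Group'] == 0][x]
    x1 = df[df['Group'] == 1][x]
    x2 = df[df['Group'] == 2][x]
    # One comparison object per pair of groups.
    cm01 = sms.CompareMeans(sms.DescrStatsW(x0), sms.DescrStatsW(x1))
    cm02 = sms.CompareMeans(sms.DescrStatsW(x0), sms.DescrStatsW(x2))
    cm12 = sms.CompareMeans(sms.DescrStatsW(x1), sms.DescrStatsW(x2))
    cprint(x, 'red', 'on_yellow')
    # Each call returns (t statistic, p-value, degrees of freedom).
    print(cm01.ttest_ind(alternative='two-sided', usevar='pooled'))
    print(cm02.ttest_ind(alternative='two-sided', usevar='pooled'))
    print(cm12.ttest_ind(alternative='two-sided', usevar='pooled'))

var = df.columns
# 14 variables x 3 pairwise comparisons = 42 t-tests.
for i in range(14):
    multi_ttests(var[i + 1])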
(Note: 'pooled' assumes the groups have equal variance. That is reasonable here because we assume the treatment has no effect on these variables, so their variances should be the same across groups. Official description: If pooled, then the standard deviation of the samples is assumed to be the same. If unequal, then the variance of Welch ttest will be used)
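To make the difference between the two `usevar` options concrete, here is a minimal sketch that computes the t statistic and degrees of freedom by hand for both cases. It uses only the standard library. The function name `t_statistic` and the sample data are made up for illustration; this is not how statsmodels implements it internally.

```python
import math
import statistics

def t_statistic(a, b, usevar="pooled"):
    """Return (t, degrees of freedom) for a two-sample t-test (illustrative)."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.fmean(a), statistics.fmean(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    if usevar == "pooled":
        # Equal-variance assumption: one pooled variance estimate.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        dof = n1 + n2 - 2
    elif usevar == "unequal":
        # Welch: separate variances, Satterthwaite degrees of freedom.
        se = math.sqrt(v1 / n1 + v2 / n2)
        dof = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    else:
        raise ValueError("usevar must be 'pooled' or 'unequal'")
    return (m1 - m2) / se, dof

# Hypothetical data: equal group sizes, clearly different spreads.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]
print(t_statistic(a, b, "pooled"))
print(t_statistic(a, b, "unequal"))
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ, so the choice matters most when the groups have different sizes as well as different variances.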
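Two of the t-tests have p-value < 0.05. With 3 × 14 = 42 tests at alpha = 0.05, we expect about 42 × 0.05 = 2.1 false positives by chance alone when there is no true difference. So both of these results could be type I errors.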
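The arithmetic above can be sketched in a few lines. The second function also gives the chance of seeing at least one false positive across all tests, under the simplifying assumption that the tests are independent. The pairwise comparisons here share groups, so they are not truly independent; treat that number as a rough indication only. Both function names are made up for this example.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha

def familywise_error_rate(n_tests, alpha):
    """P(at least one false positive), ASSUMING independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_tests = 3 * 14  # 3 pairwise comparisons for each of 14 variables
alpha = 0.05

print(expected_false_positives(n_tests, alpha))         # about 2.1
print(round(familywise_error_rate(n_tests, alpha), 3))  # about 0.884
```

So with 42 tests, getting one or two "significant" results is what we should expect even if nothing differs between the groups.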
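import numpy as np
import statsmodels.stats.api as sms

lift = 1.1  # relative lift applied to the test group
ctr0 = 0.5  # baseline success probability
# 1000 observations per group, each a count of successes out of 30 trials.
ctrl = np.random.binomial(30, p=ctr0, size=1000) * 1.0
test = np.random.binomial(30, p=ctr0 * lift, size=1000) * 1.0
cm = sms.CompareMeans(sms.DescrStatsW(test), sms.DescrStatsW(ctrl))
# 95% CI for mean(test) - mean(ctrl): t-based first, then normal approximation.
print(cm.tconfint_diff(alpha=0.05, alternative='two-sided', usevar='unequal'))
print(cm.zconfint_diff(alpha=0.05, alternative='two-sided', usevar='unequal'))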
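The code above builds the ctrl and test groups by simulation and prints the 95% confidence interval for the difference in means, once from the t distribution and once from the normal approximation.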
# Assumes df, sms and cprint are already available, as for multi_ttests.
def multi_CI(x):
    # Split variable x by group.
    x0 = df[df['Group'] == 0][x]
    x1 = df[df['Group'] == 1][x]
    x2 = df[df['Group'] == 2][x]
    cm01 = sms.CompareMeans(sms.DescrStatsW(x0), sms.DescrStatsW(x1))
    cm02 = sms.CompareMeans(sms.DescrStatsW(x0), sms.DescrStatsW(x2))
    cm12 = sms.CompareMeans(sms.DescrStatsW(x1), sms.DescrStatsW(x2))
    cprint(x, 'red', 'on_yellow')
    # 95% z-based CI for each pairwise difference in means.
    print(cm01.zconfint_diff(alpha=0.05, alternative='two-sided', usevar='pooled'))
    print(cm02.zconfint_diff(alpha=0.05, alternative='two-sided', usevar='pooled'))
    print(cm12.zconfint_diff(alpha=0.05, alternative='two-sided', usevar='pooled'))

# Same 14 variables as before (columns 1..14).
for col in df.columns[1:15]:
    multi_CI(col)
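To tie confidence intervals back to type I error: a 95% interval for the difference that excludes zero is a "significant" result, and when there is truly no difference this should happen in roughly 5% of experiments. The sketch below checks that by simulation. It is an illustration under stated assumptions: both groups are drawn from the same distribution (no lift), the z interval is computed by hand with NumPy instead of calling `zconfint_diff`, and the function names are made up.

```python
import numpy as np

Z_975 = 1.959964  # two-sided 95% critical value of the standard normal

def z_confint_diff(test, ctrl, z=Z_975):
    """Normal-approximation CI for mean(test) - mean(ctrl), unequal variances."""
    diff = test.mean() - ctrl.mean()
    se = np.sqrt(test.var(ddof=1) / test.size + ctrl.var(ddof=1) / ctrl.size)
    return diff - z * se, diff + z * se

def false_positive_rate(n_sims=2000, n=1000, ctr0=0.5, seed=0):
    """Share of no-effect experiments whose 95% CI excludes zero."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        # Both groups use the same success probability: any "effect" is noise.
        ctrl = rng.binomial(30, ctr0, size=n).astype(float)
        test = rng.binomial(30, ctr0, size=n).astype(float)
        low, high = z_confint_diff(test, ctrl)
        if low > 0 or high < 0:
            hits += 1
    return hits / n_sims

print(false_positive_rate())  # should land near alpha = 0.05
```

An interval that excludes zero is therefore not proof of a real difference; at alpha = 0.05 about one in twenty no-effect comparisons will look like that.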
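To reduce type I error we can lower alpha, for example from 5% to 1%. Fewer t-tests will have p-value < 0.01 than p-value < 0.05, and possibly none will, so fewer false positives are reported. This does not eliminate type I error: each test still has a 1% chance of a false positive when there is no true effect, and a stricter alpha makes real effects harder to detect (more type II errors). The same applies to confidence intervals: a smaller alpha gives wider intervals, which are less likely to exclude zero by chance.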
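The effect of lowering alpha can be sketched with a small simulation. When there is no true effect, p-values are uniformly distributed on [0, 1], so the code below draws 42 uniform "p-values" per experiment and counts how many fall under each threshold. The third threshold, 0.05 / 42, is the Bonferroni correction, a common convention for multiple tests that is included here only for comparison. The function name and the simulation settings are assumptions for illustration.

```python
import random

def mean_false_positives(alpha, n_tests=42, n_reps=5000, seed=0):
    """Average number of null p-values below alpha per family of n_tests."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_reps):
        # Under the null hypothesis each p-value is uniform on [0, 1].
        total += sum(rng.random() < alpha for _ in range(n_tests))
    return total / n_reps

n_tests = 42
for alpha in (0.05, 0.01, 0.05 / n_tests):
    avg = mean_false_positives(alpha, n_tests=n_tests)
    print(f"alpha={alpha:.5f}  mean false positives per 42 tests: {avg:.3f}")
```

The averages shrink roughly in proportion to alpha (about 2.1, then about 0.42), but they never reach exactly zero, which is the point made above: a lower alpha reduces false positives without eliminating them.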