How do you test for a significant difference between two sets of data?

Oftentimes we want to compare sets of samples. Such comparisons include whether wild-type samples have different expression compared to mutants, or whether healthy samples are different from disease samples in some measurable feature (blood count, gene expression, methylation of certain loci). Since there is variability in our measurements, we need to take that into account when comparing the sets of samples. We could simply subtract the means of the two samples, but given the variability introduced by sampling, we need at the very least to decide on a cutoff value for the difference of means; small differences of means can be explained by random chance due to sampling. That means we need to compare the difference we get to a value that is typical to get if the difference between the two group means were only due to sampling. If you followed the logic above, we have actually just introduced two core ideas of something called “hypothesis testing”, which is simply using statistics to determine the probability that a given hypothesis (e.g., that two sample sets are from the same population or not) is true. Formally, the expanded version of those two core ideas is as follows:

  1. Decide on a hypothesis to test, often called the “null hypothesis” (\(H_0\)). In our case, the hypothesis is that there is no difference between sets of samples. An “alternative hypothesis” (\(H_1\)) is that there is a difference between the samples.
  2. Decide on a statistic to test the truth of the null hypothesis.
  3. Calculate the statistic.
  4. Compare it to a reference value to establish significance, the P-value. Based on that, either reject or fail to reject the null hypothesis, \(H_0\).

There is one intuitive way to go about this. If we believe there are no differences between samples, that means the sample labels (test vs. control or healthy vs. disease) have no meaning. So, if we randomly assign labels to the samples and calculate the difference of the means, this creates a null distribution for \(H_0\) where we can compare the real difference and measure how unlikely it is to get such a value under the expectation of the null hypothesis. We can calculate all possible permutations to calculate the null distribution. However, sometimes that is not very feasible and the equivalent approach would be generating the null distribution by taking a smaller number of random samples with shuffled group membership.

Below, we are doing this process in R. We are first simulating two samples from two different distributions. These would be equivalent to gene expression measurements obtained under different conditions. Then, we calculate the difference in the means and do the randomization procedure to get a null distribution under the assumption that there is no difference between samples, \(H_0\). We then calculate how often we would get the original difference we calculated under the assumption that \(H_0\) is true. The resulting null distribution and the original value are shown in Figure 3.9.

set.seed(100)
gene1=rnorm(30,mean=4,sd=2)
gene2=rnorm(30,mean=2,sd=2)
org.diff=mean(gene1)-mean(gene2)
gene.df=data.frame(exp=c(gene1,gene2),
                  group=c( rep("test",30),rep("control",30) ) )


# null distribution: shuffle group membership by randomly splitting the
# 60 measurements into two groups of 30, and record the difference of means
exp.null <- do.call(rbind,
                    lapply(1:1000, function(x) {
                      idx <- sample(60, 30)
                      mean(gene.df[idx, "exp"]) - mean(gene.df[-idx, "exp"])
                    }))
p.val=sum(exp.null[,1]>org.diff)/length(exp.null[,1])
p.val

## [1] 0.001

After doing random permutations and getting a null distribution, it is possible to get a confidence interval for the distribution of the difference in means. This is simply the 2.5th and 97.5th percentiles of the null distribution, and it is directly related to the P-value calculation above.
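
For example, with the exp.null null distribution computed above, such an interval can be obtained with the quantile() function (a minimal sketch; the exact bounds will vary with the random shuffles):

# 2.5th and 97.5th percentiles of the permutation-based null distribution
quantile(exp.null[,1], probs=c(0.025,0.975))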

We can also test the difference between means using a t-test. Sometimes we will have too few data points in a sample to do a meaningful randomization test; in addition, randomization takes more time than a t-test. The t-test depends on the t distribution: the line of thought follows from the CLT, and we can show that differences in means are t distributed. There are a couple of variants of the t-test for this purpose. If we assume the population variances are equal, we can use the following version:

\[t = \frac{\bar {X}_1 - \bar{X}_2}{s_{X_1X_2} \cdot \sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\] where \[s_{X_1X_2} = \sqrt{\frac{(n_1-1)s_{X_1}^2+(n_2-1)s_{X_2}^2}{n_1+n_2-2}}\] In the first equation above, the quantity is t distributed with \(n_1+n_2-2\) degrees of freedom. We can calculate the quantity and then use software to look up the percentile of that value in the t distribution, which gives us our P-value. When we cannot assume equal variances, we use “Welch’s t-test”, which is the default t-test in R and also works well when variances and sample sizes are the same. For this test we calculate the following quantity:

\[t = \frac{\overline{X}_1 - \overline{X}_2}{s_{\overline{X}_1 - \overline{X}_2}}\] where \[s_{\overline{X}_1 - \overline{X}_2} = \sqrt{\frac{s_1^2 }{ n_1} + \frac{s_2^2 }{n_2}}\]

and the degrees of freedom are equal to

\[\mathrm{d.f.} = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1-1) + (s_2^2/n_2)^2/(n_2-1)} \]
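
To make these formulas concrete, below is a small hand calculation (a sketch using gene1 and gene2 from above, with base R’s var() and pt()); the resulting t statistic, degrees of freedom and P-value should match the Welch t-test output shown further below.

# Welch's t statistic, degrees of freedom and two-sided P-value by hand
n1=length(gene1); n2=length(gene2)
s1=var(gene1); s2=var(gene2)
se=sqrt(s1/n1 + s2/n2)                  # standard error of the difference in means
t.stat=(mean(gene1)-mean(gene2))/se
df=(s1/n1 + s2/n2)^2 / ((s1/n1)^2/(n1-1) + (s2/n2)^2/(n2-1)) # Welch-Satterthwaite d.f.
p.value=2*pt(-abs(t.stat), df)          # two-sided P-value from the t distribution
c(t=t.stat, df=df, p=p.value)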

Luckily, R does all those calculations for us. Below we show the use of the t.test() function in R, applied to the samples we simulated above.

# Welch's t-test
stats::t.test(gene1,gene2)

## 
##  Welch Two Sample t-test
## 
## data:  gene1 and gene2
## t = 3.7653, df = 47.552, p-value = 0.0004575
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.872397 2.872761
## sample estimates:
## mean of x mean of y 
##  4.057728  2.185149

# t-test with equal variance assumption
stats::t.test(gene1,gene2,var.equal=TRUE)

## 
##  Two Sample t-test
## 
## data:  gene1 and gene2
## t = 3.7653, df = 58, p-value = 0.0003905
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.8770753 2.8680832
## sample estimates:
## mean of x mean of y 
##  4.057728  2.185149

A final word on t-tests: they generally assume that the samples come from a normally distributed population; however, it has been shown that the t-test can tolerate deviations from normality, especially when the two distributions are moderately skewed in the same direction. This is due to the central limit theorem, which says that the means of samples will be distributed normally no matter the population distribution, provided the sample sizes are large.
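
As a small illustrative simulation (not from the original text), we can check this robustness by repeatedly applying the t-test to two samples drawn from the same skewed distribution; since the null hypothesis is true in every run, the fraction of P-values below 0.05 should stay close to 5% despite the non-normality.

# both groups come from the same skewed (exponential) distribution, so H0 is true
set.seed(10)
pvals=replicate(1000, t.test(rexp(30), rexp(30))$p.value)
mean(pvals < 0.05)   # should be roughly 0.05 if the t-test tolerates the skew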

We should think of hypothesis testing as a non-error-free method of making decisions. There will be times when we declare something significant and accept \(H_1\) but we will be wrong. These decisions are also called “false positives” or “false discoveries”, and are also known as “type I errors”. Similarly, we can fail to reject a hypothesis when we actually should. These cases are known as “false negatives”, also known as “type II errors”.

The ratio of true negatives to the sum of true negatives and false positives (\(\frac{TN}{FP+TN}\)) is known as specificity; we usually want to decrease the number of false positives and thereby get higher specificity. The ratio of true positives to the sum of true positives and false negatives (\(\frac{TP}{TP+FN}\)) is known as sensitivity; again, we usually want to decrease the number of false negatives and get higher sensitivity. Sensitivity is also known as the “power of a test” in the context of hypothesis testing. More powerful tests will be highly sensitive and will have fewer type II errors. For the t-test, the power is positively associated with sample size and effect size: the larger the sample size, the smaller the standard error, and looking for larger effect sizes will similarly increase the power.
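
As a tiny numerical sketch of these definitions (with made-up counts, purely for illustration):

# hypothetical outcome counts from a set of tests
TP=80; FP=20; TN=880; FN=20
TN/(FP+TN)   # specificity: fraction of true nulls we correctly did not reject
TP/(TP+FN)   # sensitivity (power): fraction of true alternatives we detected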

A general summary of these different decision combinations is given in the table below.

|  | \(H_0\) is TRUE (gene is NOT differentially expressed) | \(H_1\) is TRUE (gene is differentially expressed) |  |
|---|---|---|---|
| Accept \(H_0\) (claim that the gene is not differentially expressed) | True Negatives (TN) | False Negatives (FN), type II error | \(m_0\): number of truly null hypotheses |
| Reject \(H_0\) (claim that the gene is differentially expressed) | False Positives (FP), type I error | True Positives (TP) | \(m-m_0\): number of truly alternative hypotheses |

We expect to make more type I errors as the number of tests increases, which means we will reject the null hypothesis by mistake more often. For example, if we perform a test at the 5% significance level, there is a 5% chance of incorrectly rejecting the null hypothesis if the null hypothesis is true. However, if we perform 1000 tests where all null hypotheses are true, the average number of incorrect rejections is 50. And if we apply the rules of probability, there is almost a 100% chance that we will have at least one incorrect rejection.

There are multiple statistical techniques to prevent this from happening. These techniques generally push the P-values obtained from multiple tests to higher values; if the individual P-value is low enough, it survives this process. The simplest method is just to multiply the individual P-value (\(p_i\)) by the number of tests (\(m\)): \(m \cdot p_i\). This is called “Bonferroni correction”. However, this is too harsh if you have thousands of tests. Other methods have been developed to remedy this. Those methods rely on ranking the P-values and dividing \(m \cdot p_i\) by the rank \(i\): \(\frac{m \cdot p_i}{i}\), which is derived from the Benjamini–Hochberg procedure. This procedure was developed to control the “False Discovery Rate (FDR)”, which is the proportion of false positives among all significant tests. In practical terms, we get the “FDR-adjusted P-value” from the procedure described above. This gives us an estimate of the proportion of false discoveries for a given test. To elaborate, a P-value of 0.05 implies that 5% of all tests will be false positives, whereas an FDR-adjusted P-value of 0.05 implies that 5% of significant tests will be false positives. The FDR-adjusted P-values will result in a lower number of false positives.
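
To see the two corrections in action, here is a small sketch on a handful of made-up P-values, comparing the hand calculations with R’s built-in p.adjust() (note that p.adjust(method="BH") additionally enforces monotonicity across the ranked P-values, so it can differ slightly from the raw formula):

# made-up P-values, purely for illustration
pvals=c(0.0001, 0.003, 0.02, 0.04, 0.45)
m=length(pvals)
bonf.manual=pmin(1, m*pvals)               # Bonferroni: m * p_i, capped at 1
bh.manual=pmin(1, m*pvals/rank(pvals))     # Benjamini-Hochberg formula: m * p_i / i
cbind(bonf.manual, p.adjust(pvals,method="bonferroni"),
      bh.manual, p.adjust(pvals,method="BH"))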

One final method that is also popular is called the “q-value” method and is related to the method above. This procedure relies on estimating the proportion of true null hypotheses from the distribution of raw P-values and using that quantity to come up with what is called a “q-value”, which is also an FDR-adjusted P-value (Storey and Tibshirani 2003). That can be practically defined as “the proportion of significant features that turn out to be false leads.” A q-value of 0.01 would mean that 1% of the tests called significant at this level will be truly null on average. Within the genomics community, q-value and FDR-adjusted P-value are used synonymously, although they can be calculated differently.

In R, the base function p.adjust() implements most of the P-value correction methods described above. For the q-value, we can use the qvalue package from Bioconductor. Below we demonstrate how to use them on a set of example P-values (the hedenfalk P-values that ship with the qvalue package). The plot in Figure 3.10 shows that Bonferroni correction does a terrible job. FDR (BH) and the q-value approach are better, but the q-value approach is more permissive than FDR (BH).

library(qvalue)
data(hedenfalk)

# q-values from the qvalue package, plus base-R adjusted P-values for comparison
qvalues <- qvalue(hedenfalk$p)$qvalues
bonf.pval <- p.adjust(hedenfalk$p, method="bonferroni")
fdr.adj.pval <- p.adjust(hedenfalk$p, method="fdr")

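One possible way to produce a comparison plot along the lines of Figure 3.10, using the quantities computed above (the styling and layout of the original figure may differ):

# plot raw P-values against the three kinds of adjusted values
plot(hedenfalk$p, qvalues, pch=19, ylim=c(0,1),
     xlab="raw P-values", ylab="adjusted P-values")
points(hedenfalk$p, bonf.pval, pch=19, col="red")
points(hedenfalk$p, fdr.adj.pval, pch=19, col="blue")
legend("bottomright", legend=c("q-value","FDR (BH)","Bonferroni"),
       fill=c("black","blue","red"))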