Choosing the probability density function for the distribution of gene expression when processing RNA-Seq data
At present, a negative binomial distribution is commonly suggested for fitting histograms of read counts obtained by processing the transcriptomes of different individuals. In this paper we analyze the “physical” basis for the broadening of the Poisson distribution and conclude that the true form of the distribution is indeed a compound Poisson distribution, of which the negative binomial distribution is a special case. The appropriate choice, however, is a different special case: the n-fold convolution of random variables with an exponential (rather than logarithmic) distribution, where n is itself a Poisson-distributed random variable. We show that the distribution of gene expression intensity in a group of individuals, calculated from published data, is described better by this convolution of exponential distributions.
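
As an illustration of the Poisson-exponential special case described above, the following minimal sketch draws samples of a sum of n independent exponential variables, where n is Poisson-distributed, and compares the sample mean and variance with the standard compound Poisson moments. The function names and the parameter values `lam` and `mean_jump` are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch is not the fitting procedure used there.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw n ~ Poisson(lam) with Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    n, product = 0, rng.random()
    while product > threshold:
        n += 1
        product *= rng.random()
    return n


def sample_poisson_exponential(lam, mean_jump, rng):
    """Draw S = X_1 + ... + X_n with n ~ Poisson(lam) and X_i ~ Exponential(mean = mean_jump)."""
    n = sample_poisson(lam, rng)
    # expovariate takes the rate, i.e. 1 / mean; an empty sum (n = 0) gives S = 0.
    return sum(rng.expovariate(1.0 / mean_jump) for _ in range(n))


def sample_mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    return mean, variance


# Hypothetical parameters, for illustration only.
lam, mean_jump = 5.0, 2.0
rng = random.Random(0)  # fixed seed so the run is reproducible

draws = [sample_poisson_exponential(lam, mean_jump, rng) for _ in range(20000)]
mean, variance = sample_mean_and_variance(draws)

# Compound Poisson moments: E[S] = lam * E[X], Var[S] = lam * E[X^2].
# For an exponential X with mean m, E[X^2] = 2 * m^2.
expected_mean = lam * mean_jump
expected_variance = lam * 2.0 * mean_jump ** 2

print(f"sample mean     {mean:.2f}  (theory {expected_mean:.2f})")
print(f"sample variance {variance:.2f}  (theory {expected_variance:.2f})")
```

With exponential summands the sum is a continuous quantity with a point mass at zero (the case n = 0), whereas logarithmic summands give the integer-valued negative binomial distribution; in both cases the variance exceeds that of a plain Poisson count with the same mean.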