Causal inference

Causal inference is the process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect. The main difference between causal inference and inference of association is that the former analyzes the response of the effect variable when the cause is changed.^[1]^[2] The science of why things occur is called etiology.

Definition

Inferring the cause of something has been described as

"...reason[ing] to the conclusion that something is, or is likely to be, the cause of something else".^[3]
"Identification of the cause or causes of a phenomenon, by establishing covariation of cause and effect, a time-order relationship with the cause preceding the effect, and the elimination of plausible alternative causes."^[4]

Methods

Epidemiological studies employ different epidemiological methods of collecting and measuring evidence of risk factors and effect and different ways of measuring association between the two. A hypothesis is formulated, and then tested with statistical methods (see Statistical hypothesis testing). It is statistical inference that helps decide if data are due to chance, also called random variation, or indeed correlated and if so how strongly.

Common frameworks for causal inference are structural equation modeling and the Rubin causal model.

In epidemiology

Epidemiology studies patterns of health and disease in defined populations of living beings, in order to infer causes and effects. An association between an exposure to a putative risk factor and a disease may be suggestive of, but is not equivalent to causality or correlation does not imply causation. Historically, Koch's postulates have been used since the 19th century to decide if a microorganism was the cause of a disease. In the 20th century the Bradford Hill criteria, described in 1965^[5] have been used to assess causality of variables outside microbiology, although even these criteria are not exclusive ways to determine causality.

In molecular epidemiology the phenomena studied are on a molecular biology level, including genetics, where biomarkers are evidence of cause or effects.

A recent trend is to identify evidence for influence of the exposure on molecular pathology within diseased tissue or cells, in the emerging interdisciplinary field of molecular pathological epidemiology (MPE). Linking the exposure to molecular pathologic signatures of the disease can help to assess causality. Considering the inherent nature of heterogeneity of a given disease, the unique disease principle, disease phenotyping and subtyping are trends in biomedical and public health sciences, exemplified as personalized medicine and precision medicine.

In computer science

Determination of cause and effect from joint observational data for two time-independent variables, say X and Y, has been tackled using asymmetry between evidence for some model in the directions, X → Y and Y → X. One idea is to incorporate an independent noise term in the model to compare the evidences of the two directions.

Here are some of the noise models for the hypothesis Y → X with the noise E:

Additive noise:^[6] $Y = F(X)+E$
Linear noise:^[7] $Y = pX + qE$
Post-non-linear:^[8] $Y = G(F(X)+E)$
Heteroskedastic noise: $Y = F(X)+E.G(X)$
Functional noise:^[9] $Y = F(X,E)$

The common assumption in these models are:

There are no other causes of Y.
X and E have no common causes.
Distribution of cause is independent from causal mechanisms.

On an intuitive level, the idea is that the factorization of the joint distribution P(Cause,Effect) into P(Cause)*P(Effect | Cause) typically yields models of lower total complexity than the factorization into P(Effect)*P(Cause | Effect). Although the notion of “complexity” is intuitively appealing, it is not obvious how it should be precisely defined.^[9]

Education

Graduate courses on causal inference have been introduced to the curriculum of many schools.

Karolinska Institutet, Department of Medical Epidemiology and Biostatistics
University of Groningen, Department of Statistics & Measurement Theory
Harvard University, School of Public Health
University of Pennsylvania, Department of Biostatistics and Epidemiology
University of California, Los Angeles, Department of Epidemiology and Department of Computer Science
McGill University, Department of Epidemiology, Biostatistics and Occupational Health
The University of British Columbia, School of Population and Public Health

References

↑ Pearl, Judea (1 January 2009). "Causal inference in statistics: An overview" (PDF). Statistics Surveys 3: 96–146. doi:10.1214/09-SS057.
↑ Morgan, Stephen; Winship, Chris (2007). Counterfactuals and Causal inference. Cambridge University Press. ISBN 978-0-521-67193-4.
↑ "causal inference". Encyclopædia Britannica, Inc. Retrieved 24 August 2014.
↑ John Shaughnessy; Eugene Zechmeister; Jeanne Zechmeister (2000). Research Methods in Psychology. McGraw-Hill Humanities/Social Sciences/Languages. pp. Chapter 1 : Introduction. ISBN 0077825365. Retrieved 24 August 2014.
↑ Hill, Austin Bradford (1965). "The Environment and Disease: Association or Causation?". Proceedings of the Royal Society of Medicine 58 (5): 295–300. PMC 1898525. PMID 14283879.
↑ Hoyer, Patrik O., et al. "Nonlinear causal discovery with additive noise models." NIPS. Vol. 21. 2008.
↑ Shimizu, Shohei, et al. "DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model." The Journal of Machine Learning Research 12 (2011): 1225-1248.
↑ Zhang, Kun, and Aapo Hyvärinen. "On the identifiability of the post-nonlinear causal model." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.
1 2 Mooij, Joris M., et al. "Probabilistic latent variable models for distinguishing between cause and effect." NIPS. 2010.

External links

This article is issued from Wikipedia - version of the Tuesday, March 01, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.