### Main

2-chi-2 is a statistical method for detecting interactions in binary traits. From this web page you can download a free, open-source tool designed to perform these analyses. It is fast enough for genome-wide association studies (GWAS). Several free data sets can also be downloaded to test the power of 2-chi-2 and of other statistical methods.

The 2-chi-2 method, the software, and all the data were developed at the Grup de Recerca de Reumatologia (GRR), a research group of the Institut de Recerca de l'Hospital Universitari Vall d'Hebron (VHIR).

### 2-chi-2 method

2-chi-2 is based on the definition of two vectors. Suppose we have two contingency tables, one defining the genotype distribution for cases and the other for controls. Then, for each table we can define a nine-component vector as

$$v_i = \sqrt{\frac{1}{1/n_{cases} + 1/n_{controls}}}\;\frac{p_i - ep_i}{\sqrt{ep_i}}$$

where $p_i$ and $ep_i$ are, respectively, the probability and the expected probability of table cell $i$, and $n_{cases}$ and $n_{controls}$ are the number of individuals in each table. The statistic is then defined as the squared length of the difference between these two vectors,

$$\sum_i (v1_i - v2_i)^2$$

In this last equation, $v1_i$ and $v2_i$ represent the vector components of cell $i$ for the case and control tables respectively. The sum is over all cells. Since each vector measures the shape and significance of the interaction in its table, their difference measures how the interaction differs between the two tables. This statistic can be generalized: the same procedure can be used to compare the statistical independence between two contingency tables of any dimension.
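The definition above can be sketched in a few lines of Python. This is only an illustration, not the code shipped with the tool: the function names are hypothetical, the tables are taken to be 3x3 genotype counts, and the expected probability of each cell is assumed to be the product of the table's row and column marginals (the page does not spell this out). All marginals are assumed to be non-zero.

```python
import math

def interaction_vector(table, n_cases, n_controls):
    """Return the vector v for one contingency table of counts."""
    total = sum(sum(row) for row in table)
    p = [[count / total for count in row] for row in table]  # observed p_i
    row_marg = [sum(row) for row in p]
    col_marg = [sum(col) for col in zip(*p)]
    scale = math.sqrt(1.0 / (1.0 / n_cases + 1.0 / n_controls))
    vector = []
    for r, row in enumerate(p):
        for c, p_i in enumerate(row):
            # Assumption: expected probability ep_i = row marginal * column marginal.
            ep_i = row_marg[r] * col_marg[c]
            vector.append(scale * (p_i - ep_i) / math.sqrt(ep_i))
    return vector

def two_chi_two(cases, controls):
    """Squared length of the difference between the case and control vectors."""
    n_cases = sum(sum(row) for row in cases)
    n_controls = sum(sum(row) for row in controls)
    v1 = interaction_vector(cases, n_cases, n_controls)
    v2 = interaction_vector(controls, n_cases, n_controls)
    return sum((a - b) ** 2 for a, b in zip(v1, v2))

# Hypothetical counts: the first table has independent rows and columns,
# the second concentrates individuals on the diagonal.
independent = [[10, 20, 10], [20, 40, 20], [10, 20, 10]]
diagonal = [[30, 10, 10], [10, 30, 10], [10, 10, 30]]
same = two_chi_two(independent, independent)
different = two_chi_two(independent, diagonal)
```

Two identical tables give a statistic of zero, while tables with different interaction patterns give a positive value. Because the loops run over whatever rows and columns are present, the same sketch applies to tables of other dimensions.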

### 2-chi-2 software

Software that applies our statistical method to a set of SNPs can be freely downloaded from this link, either as precompiled binaries or as source code. To compile the code you need a C++ compiler, the standard libraries, and the Boost C++ libraries properly installed. As input, the program uses the data file formats of the PLINK tool. In particular, the data sources can be given as PED files or binary PED files. The basic usage is,

`2chi2 --input <file> --output <file>`

The available options are,

Option | Description |
---|---|
--b | Must be used when the input file is a binary PED |
--threshold | Sets the significance threshold. Results with higher significance will be stored. (default value is 0.01) |

SNP1 | SNP2 | P | EV<5 |
---|---|---|---|
rs3748597 | rs4970405 | 1.0e-3 | 0 |
rs3748597 | rs4648764 | 4.2e-5 | 1 |
rs3748597 | rs6603811 | 7.1e-6 | 0 |
rs4970405 | rs7531583 | 1.1e-2 | 2 |
... | ... | ... | ... |
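A table like the one above can be post-processed with a short script. The sketch below assumes the results are stored as whitespace-separated text with a header row naming the four columns; the on-disk layout is not described on this page, so adjust the parsing to the actual file. The function name `read_pairs` and the cutoff `max_p` are hypothetical.

```python
def read_pairs(lines, max_p=1e-4):
    """Yield (snp1, snp2, p, ev_lt5) for rows whose P value is at most max_p."""
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header row, and rows without four columns.
        if len(fields) != 4 or fields[0] == "SNP1":
            continue
        p_value = float(fields[2])
        if p_value <= max_p:
            yield fields[0], fields[1], p_value, int(fields[3])

# The rows shown in the table above, in the assumed text layout.
sample = """SNP1 SNP2 P EV<5
rs3748597 rs4970405 1.0e-3 0
rs3748597 rs4648764 4.2e-5 1
rs3748597 rs6603811 7.1e-6 0
rs4970405 rs7531583 1.1e-2 2
""".splitlines()

# Keep the most significant pairs, smallest P value first.
hits = sorted(read_pairs(sample), key=lambda row: row[2])
```

With a real results file, pass an open file object instead of `sample`, since the function only iterates over lines.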

### Data for the 48 basic epistatic models

The data is divided into two files. The file models48.dat contains the definition of each epistatic model. The file data48.dat contains the data sets generated by applying different biological parameters to each epistatic model. These files can be downloaded here. Their format is as follows:

#### models48.dat

The file has the following structure,

Model | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 | G9 |
---|---|---|---|---|---|---|---|---|---|
M1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
M2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
M3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

| | aa | Aa | AA |
---|---|---|---|
bb | α | α | α |
Bb | α | α | α |
BB | α | α(1+β) | α |
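The relation between a model row and its penetrance table can be sketched as follows. This is an illustration under assumptions: G1 to G9 are taken to run row by row through the genotype table (bb, Bb, BB rows against aa, Aa, AA columns), a value of 1 is taken to mark a cell with penetrance α(1+β) and a value of 0 a cell with baseline penetrance α. Under that ordering the table above matches the row of model M2. The function name and the numeric values of α and β are hypothetical.

```python
ROWS = ("bb", "Bb", "BB")
COLS = ("aa", "Aa", "AA")

def penetrance_table(model, alpha, beta):
    """Map a nine-value 0/1 model row (G1..G9) to a 3x3 penetrance table."""
    if len(model) != 9:
        raise ValueError("a model row needs nine values, G1 to G9")
    # Assumption: 1 marks a cell with raised penetrance, 0 the baseline.
    values = [alpha * (1 + beta) if g == 1 else alpha for g in model]
    # Assumption: G1..G9 fill the table row by row.
    return [values[r * 3:(r + 1) * 3] for r in range(3)]

m2 = [0, 0, 0, 0, 0, 0, 0, 1, 0]  # row M2 from models48.dat as shown above
table = penetrance_table(m2, alpha=0.01, beta=1.0)  # hypothetical alpha, beta

for label, row in zip(ROWS, table):
    print(label, dict(zip(COLS, row)))
```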

#### data48.dat

The file has the following structure, where the first five columns are the parameters and the remaining columns are the G1 to G9 values for cases and then for controls,

Model | S1 | S2 | Prev | Odds | Cases G1 | ... | Cases G9 | Controls G1 | ... | Controls G9 |
---|---|---|---|---|---|---|---|---|---|---|
M1 | 0.1 | 0.1 | 0.01 | 2 | 0.13 | ... | 0.07 | 0.03 | ... | 0.21 |
M1 | 0.1 | 0.1 | 0.01 | 3 | 0.27 | ... | 0.02 | 0.14 | ... | 0.02 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
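A row of this file can be split into its parameters and its two genotype distributions with a small helper. The sketch assumes one whitespace-separated record per line, with the five parameters followed by nine values for cases and nine for controls; the actual delimiter is not documented here. The function name `parse_row` is hypothetical, and the example line uses made-up placeholder values, not values taken from data48.dat.

```python
PARAMS = ("Model", "S1", "S2", "Prev", "Odds")

def parse_row(line):
    """Split one data row into parameters, case values and control values."""
    fields = line.split()
    expected = len(PARAMS) + 18  # five parameters, then G1..G9 twice
    if len(fields) != expected:
        raise ValueError(f"expected {expected} columns, got {len(fields)}")
    row = {"Model": fields[0]}
    for name, value in zip(PARAMS[1:], fields[1:5]):
        row[name] = float(value)
    row["cases"] = [float(x) for x in fields[5:14]]
    row["controls"] = [float(x) for x in fields[14:23]]
    return row

# Placeholder line for illustration only (hypothetical values).
example = ("M1 0.1 0.1 0.01 2 "
           "0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 0.1 "
           "0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1")
row = parse_row(example)
```

The same helper could be adapted to data2488.dat by adding the extra `R` parameter to `PARAMS` and shifting the slices by one column.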

### Data for the 2488 extended epistatic models

The data is divided into two files. The file models2488.dat contains the definition of each epistatic model. The file data2488.dat contains the data sets generated by applying different biological parameters to each epistatic model. These files can be downloaded here. Their format is as follows:

#### models2488.dat

The file has the following structure,

Model | G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 | G9 |
---|---|---|---|---|---|---|---|---|---|
ME1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
ME2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 |
ME3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
ME4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
ME5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | -1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

| | aa | Aa | AA |
---|---|---|---|
bb | α | α | α |
Bb | α | α | α |
BB | α | α(1+β) | αγ/(1+θ) |

#### data2488.dat

The file has the following structure, where the first six columns are the parameters and the remaining columns are the G1 to G9 values for cases and then for controls,

Model | S1 | S2 | Prev | R | Odds | Cases G1 | ... | Cases G9 | Controls G1 | ... | Controls G9 |
---|---|---|---|---|---|---|---|---|---|---|---|
ME1 | 0.1 | 0.1 | 0.01 | 0.01 | 2 | 0.13 | ... | 0.07 | 0.03 | ... | 0.21 |
ME1 | 0.1 | 0.1 | 0.01 | 0.01 | 3 | 0.27 | ... | 0.02 | 0.14 | ... | 0.02 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

### Download

File | Link |
---|---|
2chi2 for Windows | Download |
2chi2 for GNU/Linux x86 | Download |
2chi2 source | Download |
Data sets | Download |

**Note:** To compile the source you need the Boost C++ libraries properly installed. Before compiling, please edit the Makefile and make the appropriate changes.

**Note:** To use the MPI-parallelized version, you need the appropriate MPI libraries and must compile from source.