3 Using CASSI

The basic usage of CASSI is to provide it with one binary PLINK-format pedigree file:

./cassi myfile.bed

This requires that the corresponding .bim and .fam files are also available. A text PLINK pedigree file, .ped, with its corresponding map file, .map, can be converted to a binary file using PLINK as follows:

plink --noweb --file mydata --make-bed --out myfile

This will create the binary pedigree file, myfile.bed, the map file, myfile.bim, and the family file, myfile.fam, all of which are required for use with CASSI.
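Before running CASSI it can be useful to confirm that all three files are present. The following is a minimal sketch only; the function name check_bed_set is hypothetical and is not part of CASSI or PLINK:

```shell
# Check that the .bed, .bim and .fam files for a binary pedigree file all exist.
# check_bed_set is a hypothetical helper, not part of CASSI or PLINK.
check_bed_set() {
    base=${1%.bed}          # strip the .bed extension to get the common file stem
    status=0
    for ext in bed bim fam; do
        if [ ! -f "$base.$ext" ]; then
            echo "missing: $base.$ext" >&2
            status=1
        fi
    done
    return $status
}

# Example usage (uncomment once CASSI is installed in the current directory):
# check_bed_set myfile.bed && ./cassi myfile.bed
```

The function returns a non-zero status if any of the three files is missing, so it can be chained with && in front of the CASSI command.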

Executing CASSI as above will perform SNP interaction tests for every pair of distinct SNPs in the .bed file using the default options. The results file will record every pair of SNPs that satisfies a given significance level, together with extra information for the performed test. A log file is also created, recording the same information that is output to the screen: the options used and summary statistics of the data. It is unlikely that you will want to use the default options; typing ./cassi with no options will output the available options.
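Because every pair of distinct SNPs is tested, the number of tests grows quadratically with the number of SNPs: n SNPs give n(n-1)/2 distinct pairs. As a rough illustration, this can be computed in the shell as follows (the function name count_pairs is hypothetical and is not part of CASSI; 64-bit shell arithmetic is assumed):

```shell
# Number of distinct SNP pairs among n SNPs: n * (n - 1) / 2.
# count_pairs is a hypothetical helper, not part of CASSI.
count_pairs() {
    n=$1
    echo $(( n * (n - 1) / 2 ))
}

count_pairs 100        # prints 4950
count_pairs 1000000    # prints 499999500000
```

One million SNPs therefore give roughly half a trillion pairs, which is why it is usually advisable to restrict the analysis rather than rely on the default options.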

CASSI is executed as follows:

./cassi [options] file.bed

or

./cassi parameterfile.pf [file.bed]

3.1 Options

The basic options for CASSI are as follows:

Option          Description
-snp1 a1 a2     First SNP window, a1 = Start SNP number, a2 = End SNP number
-snp2 b1 b2     Second SNP window, b1 = Start SNP number, b2 = End SNP number
-i file.bed     Input file
-i2 file2.bed   Second (optional) input file for second SNP window
-o file.out     Results output file
-log file.log   Log file
-max m          Maximum number of results, a safeguard against accidentally outputting half a trillion results. Set to 0 for no maximum, at your own risk!
-so             Suppress output to screen

The SNP numbers “a1” and “a2” etc. refer to the position the SNP appears in the map file (.bim).
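If only the SNP identifier is known, its SNP number can be looked up from the .bim file. The sketch below assumes the standard PLINK .bim layout, in which the second column holds the SNP identifier, and assumes that SNP numbers start at 1 for the first line of the file. The function name snp_number and the identifier rs12345 are hypothetical:

```shell
# Print the line number of a SNP in a PLINK .bim file.
# Usage: snp_number file.bim snp_id
# snp_number is a hypothetical helper, not part of CASSI or PLINK.
snp_number() {
    # Column 2 of a .bim file is the SNP identifier; NR is the 1-based line number.
    awk -v id="$2" '$2 == id { print NR; exit }' "$1"
}

# Example usage (file name and SNP identifier are placeholders):
# snp_number mydata.bim rs12345
```

Nothing is printed if the identifier does not appear in the file.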

For example, to use these options to analyse SNPs from SNP number 1 to SNP number 60 against SNPs from SNP number 50 to SNP number 100 using the binary pedigree file mydata.bed, type the following:

./cassi -snp1 1 60 -snp2 50 100 mydata.bed

This will output details of the analysis, which will look something like the following:

CASSI: SNP interaction analysis software, v1.0
----------------------------------------------
Copyright 2012 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: mydata.bed
Output file: cassi.out
Start SNP of first SNP window: 1
End SNP of first SNP window: 60
Start SNP of second SNP window: 50
End SNP of second SNP window: 100
Maximum no. of results: 1000000

Test Statistic: Joint Effects
P-value threshold for case/control results: 0.0001
P-value threshold for case only results: 0.0001

Data Summary Statistics:
Number of SNPs: 100
Number of subjects: 4686
Number of cases: 1748 (37.3026%)
Number of controls: 2938 (62.6974%)

Number of results found: 80

Run time: less than one second

To do the above analysis where the second SNP window is given in a different pedigree file, type the following:

./cassi -snp1 1 60 -snp2 50 100 -i mydata.bed -i2 mydata2.bed

Options that are specific to the joint effects test are:

Option          Description
-thcc t         P-value threshold for case/control test (set to 0 for no output)
-thco t         P-value threshold for case only test (set to 0 for no output)
-th t           P-value threshold for either test (set to 0 for no output)

These options set the p-value thresholds that a joint effects SNP interaction test result must meet to be considered significant enough to record in the results file. For more details see section 4.

The default options for CASSI are:

Option          Default
-snp1 a1 a2     All SNPs in pedigree file
-snp2 b1 b2     All SNPs in pedigree file (or 2nd pedigree file if given)
-o file.out     cassi.out
-log file.log   cassi.log
-th t           0.0001
-max m          1000000 (10^6)

3.2 Parameter file

A parameter file, .pf, may be used with CASSI instead of writing all of the options on the command line. To use a parameter file simply type:

./cassi myparameters.pf

The parameter file should be a text file with one option written on each line. For example, to perform the analysis above the file myparameters.pf would be as follows:

-snp1 1 60
-snp2 50 100
-i mydata.bed

It is also possible to add comments to the file, provided that the “-” character is not used in them, and to comment out any option by placing another character in front of its “-”. For example, the above parameter file could be edited as follows to perform the second analysis given above, which uses two pedigree files:

This is the first SNP window
-snp1 1 60

This is the second SNP window
-snp2 50 100

This is the pedigree file for the first SNP window
-i mydata.bed

This is the pedigree file for the second SNP window
-i2 mydata2.bed

I might try this threshold later
#-th 0.00001
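
As an illustration of the commenting rules, the following sketch writes the commented parameter file above from the shell and then lists the lines that begin with “-”. It assumes, from the description in this section, that these are the lines CASSI treats as options; grep is used here only to inspect the file and is not part of CASSI:

```shell
# Write the commented parameter file (the file name is a placeholder).
cat > myparameters.pf <<'EOF'
This is the first SNP window
-snp1 1 60

This is the second SNP window
-snp2 50 100

This is the pedigree file for the first SNP window
-i mydata.bed

This is the pedigree file for the second SNP window
-i2 mydata2.bed

I might try this threshold later
#-th 0.00001
EOF

# List the lines starting with "-" (assumed to be the active options).
# The commented-out line "#-th 0.00001" is not listed.
grep '^-' myparameters.pf
```

Here four lines are listed: the two SNP windows and the two input files.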