3 Using SnipSnip

3.1 Basic Usage

The program SnipSnip takes a PLINK binary pedigree file as input. Basic usage of the program is given by typing:

./snipsnip -o myresults.dat mydata.bed

3.2 Options

Typing snipsnip with no options will output usage details:

SnipSnip: Imputation without imputation, v1.1
---------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Usage:
  ./snipsnip [options] pedigree.bed 
 or ./snipsnip -pf parameterfile [pedigree.bed]

Options:
  -window-size n    -- fix window at n SNPS, n must be even
  -window-size-bp x -- size of window, x, in kB
  -start a          -- start analysis from SNP number a
  -start-end a b    -- start and end analysis from SNP numbers a to b
  -i file.bed       -- input binary pedigree file, file.bed
  -o results.dat    -- output results file, results.dat
  -log results.log  -- log filename, results.log
  -covar covars.dat -- covariate filename, covars.dat
  -covar-number no  -- covariate number, no
  -covar-name na    -- covariate name, na
  -linear           -- use linear regression
  -mqtv x           -- missing quantitive trait value for linear regression
  -dominant         -- use dominant correlation partner metric
  -recessive        -- use recessive correlation partner metric
  -lr               -- perform standard logistic(linear) regression tests
  -excsnp bp        -- exclude SNP(base pair bp) as partner
  -so               -- suppress output to screen

Default Options in Effect:
  -window-size 10
  -o snipsnipResults.dat

3.3 Parameter file

A parameter file, .pf, may be used with SnipSnip instead of writing all of the options on the command line. To use a parameter file simply type:

./snipsnip myparameters.pf

The parameter file should be a text file with one option written on each line. For example, to perform an analysis with a SNP window of size 12, test only SNPs 100 to 200, and include single SNP logistic regression results, with input file mydata.bed and output file myresults.dat, the file myparameters.pf would be as follows:

-window-size 12
-start-end 100 200
-lr
-i mydata.bed
-o myresults.dat
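
As a rough sketch, the parameter file above can be written from the shell with a here-document. The last step prints the command line that the usage text in section 3.2 suggests should be equivalent. It is printed rather than run, and the equivalence is an assumption based on the option list, not something this manual states outright.

```shell
# Write the example parameter file, one option per line.
cat > myparameters.pf <<'EOF'
-window-size 12
-start-end 100 200
-lr
-i mydata.bed
-o myresults.dat
EOF

# Join the lines with single spaces to form a command-line option string.
opts=$(paste -sd ' ' myparameters.pf)

# Print (do not run) the assumed equivalent command line.
echo "./snipsnip $opts"
```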

It is also possible to add comments to the file provided that the “-” character is not used, and to comment out any options by placing another character in front of any “-”. For example, the above parameter file could be edited as follows:

I will use this window size
-window-size 12

Must remember to analyse other SNPs later
-start-end 100 200

Check single SNP logistic regression results also
-lr

This is my data
-i mydata.bed

Output the results here
-o myresults.dat

When I run lots of things I will suppress the output to screen 
#-so
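
Before a long run it can be useful to preview which options in a commented parameter file are still active. The sketch below assumes that an active option is a line whose first character is "-", which matches the commenting rule described above but is not taken from SnipSnip's actual parser. The file contents are a shortened copy of the example.

```shell
# Recreate a shortened version of the commented parameter file.
cat > myparameters.pf <<'EOF'
I will use this window size
-window-size 12

This is my data
-i mydata.bed

When I run lots of things I will suppress the output to screen
#-so
EOF

# Assumption: active options are the lines that begin with "-".
# The commented-out "#-so" line begins with "#", so it is not listed.
active=$(grep '^-' myparameters.pf)
echo "$active"
```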

3.4 Input

SnipSnip takes standard PLINK binary pedigree files, .bed, as input. This requires that the corresponding .bim and .fam files are also available. A text PLINK pedigree file, .ped, with corresponding map file, .map, may be used to create a binary file using PLINK as follows:

plink --noweb --file mydata --make-bed --out myfile

This will create the binary pedigree file, myfile.bed, map file, myfile.bim, and family file, myfile.fam, required for use with SnipSnip.
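
Since all three files must be present, a quick check before starting an analysis can save a failed run. The helper below is a hypothetical convenience function, not part of SnipSnip or PLINK. It assumes the three files share the same prefix and sit in the same directory.

```shell
# Hypothetical helper: succeed only if the .bed, .bim and .fam files all exist.
check_plink_fileset() {
    prefix=${1%.bed}    # accept either "myfile" or "myfile.bed"
    for ext in bed bim fam; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext" >&2
            return 1
        fi
    done
    return 0
}

# Example use (shown as a comment so that nothing is run here):
#   check_plink_fileset myfile.bed && ./snipsnip -o myresults.dat myfile.bed
```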

3.5 Output

The main results file is a text file in which each row gives the results for one SNP. For example, using the default options gives the following:

SNP CHR ID BP PARTNER_ID PARTNER_BP CORRELATION SCORE FIT_STATUS CHISQ P
1 0 rs7112558 5569598 rs11038270 5572829 0.1875740 84.85855 Y 1.56046028 0.2115978
2 0 rs7123372 5569768 rs12786429 5570176 0.5597501 68.06836 Y 0.39244220 0.5310185
...

The columns of the results file, which will differ depending on the chosen options, are as follows:

Column         Description
SNP            The SNP number (of the anchor SNP) as it appears in the file.
CHR            Chromosome of the anchor SNP.
ID             The name of the anchor SNP.
BP             The base pair position of the anchor SNP.
PARTNER_ID     The name of the partner SNP.
PARTNER_BP     The base pair position of the partner SNP.
CORRELATION    The correlation (r^2) between the anchor SNP and partner SNP.
SCORE          The score (0-100) between the anchor SNP and partner SNP. High scores are best.
FIT_STATUS     A “Y” indicates that the model fitted with no problems. An “N” indicates that the model did not fit, most likely due to insufficient data in the cases and/or controls. A “D” indicates that there was too little data in the cases and/or controls to attempt fitting the model.
CHISQ          The chi-squared test statistic with one degree of freedom from performing a likelihood ratio test comparing logistic regression models.
FSTAT          The F test statistic with 1 and (number of subjects - 3) degrees of freedom from performing an F-test comparing linear regression models.
P              The p-value for the test of association of the anchor SNP.
FIT_STATUS_LR  The fit status for single SNP logistic (or linear) regression at the anchor SNP.
CHISQ_LR       The chi-squared test statistic with one degree of freedom for single SNP logistic regression at the anchor SNP.
FSTAT_LR       The F test statistic with 1 and (number of subjects - 3) degrees of freedom for single SNP linear regression at the anchor SNP.
P_LR           The p-value for the test of association of the anchor SNP using single SNP logistic (or linear) regression.
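
Because the set of columns depends on the chosen options, post-processing scripts are more robust if they locate columns by header name rather than by position. The following sketch uses awk on a copy of the two example rows shown earlier. The threshold of 0.5 is purely illustrative, chosen so that one row passes, and is not a recommended significance level.

```shell
# Sample results file containing the two example rows from this section.
cat > myresults.dat <<'EOF'
SNP CHR ID BP PARTNER_ID PARTNER_BP CORRELATION SCORE FIT_STATUS CHISQ P
1 0 rs7112558 5569598 rs11038270 5572829 0.1875740 84.85855 Y 1.56046028 0.2115978
2 0 rs7123372 5569768 rs12786429 5570176 0.5597501 68.06836 Y 0.39244220 0.5310185
EOF

# Map header names to column numbers, then print ID and P for SNPs whose
# model fitted ("Y") and whose p-value is below the illustrative threshold.
hits=$(awk -v thresh=0.5 '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $(col["FIT_STATUS"]) == "Y" && $(col["P"]) + 0 < thresh {
        print $(col["ID"]), $(col["P"])
    }
' myresults.dat)
echo "$hits"
```

The same header lookup works unchanged when extra columns such as P_LR are present, since each column is found by its exact name.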