5 PseudoCons Examples

The options of PseudoCons are demonstrated in the following examples using the example data set included in the PseudoCons download. The example data set consists of 100 pedigrees: the first 50 are case/parent trios, the next 25 are case/parent trios with an extra sibling, and the final 25 are case/parent trios in which the parents of the case's mother are also included. There are 10 SNPs in the data set, with allele names A and G.
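
As a quick sanity check, the composition described above can be tied to the summary statistics that PseudoCons reports in the examples. The following Python sketch is illustrative only: it assumes pedigree sizes of 3, 4 and 5 for the three groups (inferred from the description, not read from the data files) and that the reported standard deviation is the sample standard deviation.

```python
import statistics

# Assumed pedigree sizes, inferred from the description of the example data:
# 50 trios (3 subjects), 25 trios plus a sibling (4 subjects) and
# 25 trios plus the two parents of the mother (5 subjects).
pedigree_sizes = [3] * 50 + [4] * 25 + [5] * 25

total_subjects = sum(pedigree_sizes)
mean_size = statistics.mean(pedigree_sizes)
sd_size = statistics.stdev(pedigree_sizes)  # sample standard deviation (n - 1)

print(total_subjects, mean_size, round(sd_size, 6))
```

Under these assumptions the values agree with the 375 subjects, mean pedigree size of 3.75 and standard deviation of 0.833333 shown in the screen output of the examples.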

5.1 One Pseudocontrol

To produce one pseudocontrol per case-parent trio, type the following:

./pseudocons -i examplePsConsData.bed -o myCasePseudocontrols.bed

or

./pseudocons -pc1 -i examplePsConsData.bed -o myCasePseudocontrols.bed

This will create screen output similar to the following:

PseudoCons: pseudocontrols from pedigree data v1.0
--------------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: examplePsConsData.bed
Output file: myCasePseudocontrols.bed
Log file: PseudoCons.log
Number of pseudocontrols per trio: 1

Number of subjects: 375
          Males: 188 (50.1333%)
          Females: 187 (49.8667%)
          Unknown sex: 0 (0%)
          Affected: 133 (35.4667%)
          Unaffected: 242 (64.5333%)
Number of SNPs: 10

Number of pedigrees: 100
Mean pedigree size: 3.75
Standard deviation of pedigree size: 0.833333

Number of trios used to create pseudocontrols: 100
Number of pedigrees with no pseudocontrols: 0

Number of cases: 100
          Males: 39 (39%)
          Females: 61 (61%)
          Unknown sex: 0 (0%)

Number of pseudocontrols: 100
          Males: 39 (39%)
          Females: 61 (61%)
          Unknown sex: 0 (0%)

Run time: less than one second

The screen output is also saved to the log file, which is PseudoCons.log by default but can be set using the -log option. The case/pseudocontrol output files are the PLINK binary pedigree files myCasePseudocontrols.bed, myCasePseudocontrols.bim and myCasePseudocontrols.fam. The created family file, myCasePseudocontrols.fam, is as follows:

1 3 0 0 1 2
1 3-pseudo-1 0 0 1 1
2 3 0 0 2 2
2 3-pseudo-1 0 0 2 1
3 3 0 0 2 2
3 3-pseudo-1 0 0 2 1
4 3 0 0 1 2
4 3-pseudo-1 0 0 1 1
5 3 0 0 1 2
5 3-pseudo-1 0 0 1 1
6 3 0 0 1 2
6 3-pseudo-1 0 0 1 1
7 3 0 0 1 2
7 3-pseudo-1 0 0 1 1
...

The file consists of one case from each pedigree and one created pseudocontrol. The pedigree ID in the first column is repeated, and the pseudocontrol subject ID is the case subject ID with “-pseudo-1” appended to it.
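
The naming convention makes it straightforward to match each case to its pseudocontrols in downstream scripts. The following Python sketch is a minimal illustration, not part of PseudoCons: it assumes the standard six-column PLINK .fam layout (pedigree ID, subject ID, father, mother, sex, phenotype with 2 = affected) and uses a few lines from the listing above. The function names are hypothetical.

```python
FAM_TEXT = """\
1 3 0 0 1 2
1 3-pseudo-1 0 0 1 1
2 3 0 0 2 2
2 3-pseudo-1 0 0 2 1
"""

def read_fam(text):
    """Parse .fam text into a list of dicts, one per subject."""
    columns = ("pedigree", "subject", "father", "mother", "sex", "phenotype")
    return [dict(zip(columns, line.split()))
            for line in text.splitlines() if line.strip()]

def pair_cases_with_pseudocontrols(rows):
    """Map (pedigree ID, case ID) to the pseudocontrol IDs derived from that case."""
    pairs = {}
    for row in rows:
        if row["phenotype"] == "2":  # affected subject: the case itself
            pairs.setdefault((row["pedigree"], row["subject"]), [])
        elif "-pseudo-" in row["subject"]:
            # Recover the case ID by removing the "-pseudo-N" suffix.
            case_id = row["subject"].rsplit("-pseudo-", 1)[0]
            pairs.setdefault((row["pedigree"], case_id), []).append(row["subject"])
    return pairs

pairs = pair_cases_with_pseudocontrols(read_fam(FAM_TEXT))
for (pedigree, case_id), pseudos in pairs.items():
    print(pedigree, case_id, pseudos)
```

The same function works unchanged when several pseudocontrols are created per case, since it only relies on the suffix.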

The created binary map file, myCasePseudocontrols.bim, is identical to the original binary map file, since the SNPs used have not changed:

1 rs1 0 1000 G A
1 rs2 0 2000 G A
1 rs3 0 3000 G A
1 rs4 0 4000 G A
2 rs5 0 10000 G A
2 rs6 0 20000 G A
2 rs7 0 30000 G A
3 rs8 0 16000 G A
3 rs9 0 32000 G A
3 rs10 0 48000 G A

5.2 Three Pseudocontrols

To produce three pseudocontrols per case-parent trio, type the following:

./pseudocons -pc3 -i examplePsConsData.bed -o myCasePseudocontrols3.bed

This will create screen output very similar to creating one pseudocontrol:

PseudoCons: pseudocontrols from pedigree data v1.0
--------------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: releases/examplePsConsData.bed
Output file: myCasePseudocontrols3.bed
Log file: PseudoCons.log
Number of pseudocontrols per trio: 3

Number of subjects: 375
          Males: 188 (50.1333%)
          Females: 187 (49.8667%)
          Unknown sex: 0 (0%)
          Affected: 133 (35.4667%)
          Unaffected: 242 (64.5333%)
Number of SNPs: 10

Number of pedigrees: 100
Mean pedigree size: 3.75
Standard deviation of pedigree size: 0.833333

Number of trios used to create pseudocontrols: 100
Number of pedigrees with no pseudocontrols: 0

Number of cases: 100
          Males: 39 (39%)
          Females: 61 (61%)
          Unknown sex: 0 (0%)

Number of pseudocontrols: 300
          Males: 117 (39%)
          Females: 183 (61%)
          Unknown sex: 0 (0%)

Run time: less than one second

This time the created binary pedigree family file is as follows:

1 3 0 0 1 2
1 3-pseudo-1 0 0 1 1
1 3-pseudo-2 0 0 1 1
1 3-pseudo-3 0 0 1 1
2 3 0 0 2 2
2 3-pseudo-1 0 0 2 1
2 3-pseudo-2 0 0 2 1
2 3-pseudo-3 0 0 2 1
3 3 0 0 2 2
3 3-pseudo-1 0 0 2 1
3 3-pseudo-2 0 0 2 1
3 3-pseudo-3 0 0 2 1
4 3 0 0 1 2
4 3-pseudo-1 0 0 1 1
4 3-pseudo-2 0 0 1 1
4 3-pseudo-3 0 0 1 1
5 3 0 0 1 2
5 3-pseudo-1 0 0 1 1
5 3-pseudo-2 0 0 1 1
5 3-pseudo-3 0 0 1 1
...

The file consists of one case from each pedigree and three created pseudocontrols. The pedigree ID in the first column is repeated, and the pseudocontrol subject IDs are the case subject ID with “-pseudo-1”, “-pseudo-2” and “-pseudo-3” appended to it. Note that the sex of the case is repeated in the pseudocontrols.
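
Because each pseudocontrol copies the sex of its case, the pseudocontrol counts in the screen output follow directly from the case counts. The short Python sketch below only illustrates this arithmetic; the function name is hypothetical and it assumes every case contributes the same number of pseudocontrols.

```python
def expected_pseudocontrol_counts(male_cases, female_cases, per_trio):
    """Pseudocontrol counts when each pseudocontrol copies the sex of its case."""
    males = male_cases * per_trio
    females = female_cases * per_trio
    return {"total": males + females, "males": males, "females": females}

# 39 male and 61 female cases, three pseudocontrols per trio.
counts = expected_pseudocontrol_counts(39, 61, 3)
print(counts)
```

For the example data this gives 300 pseudocontrols, 117 male and 183 female, which matches the output above.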

As before, the created binary map file (.bim) is identical to the original binary map file.

5.3 Fifteen Pseudocontrols

To produce fifteen pseudocontrols per case-parent trio, based on the non-transmitted allele combinations from two SNPs, type the following:

./pseudocons -pc15 -snpnames rs1 rs3 -i examplePsConsData.bed -o myCasePseudocontrols15.bed

where the option -snpnames rs1 rs3 picks the two SNPs to be considered using the SNP names. The SNPs can also be chosen by the order in which they appear in the file, so to choose the 1st and 3rd SNPs in the file, type the following:

./pseudocons -pc15 -snpnos 1 3 -i examplePsConsData.bed -o myCasePseudocontrols15.bed

This will output to screen something similar to:

PseudoCons: pseudocontrols from pedigree data v1.0
--------------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: examplePsConsData.bed
Output file: myCasePseudocontrols15.bed
Log file: PseudoCons.log
Interaction using SNP names rs1 and rs3
Number of pseudocontrols per trio: 15

Number of subjects: 375
          Males: 188 (50.1333%)
          Females: 187 (49.8667%)
          Unknown sex: 0 (0%)
          Affected: 133 (35.4667%)
          Unaffected: 242 (64.5333%)
Number of SNPs: 10

Number of pedigrees: 100
Mean pedigree size: 3.75
Standard deviation of pedigree size: 0.833333

Number of trios used to create pseudocontrols: 100
Number of pedigrees with no pseudocontrols: 0

Number of cases: 100
          Males: 39 (39%)
          Females: 61 (61%)
          Unknown sex: 0 (0%)

Number of pseudocontrols: 1500
          Males: 585 (39%)
          Females: 915 (61%)
          Unknown sex: 0 (0%)

Run time: less than one second

This time the created binary pedigree family file is as follows:

1 3 0 0 1 2
1 3-pseudo-1 0 0 1 1
1 3-pseudo-2 0 0 1 1
1 3-pseudo-3 0 0 1 1
1 3-pseudo-4 0 0 1 1
1 3-pseudo-5 0 0 1 1
1 3-pseudo-6 0 0 1 1
1 3-pseudo-7 0 0 1 1
1 3-pseudo-8 0 0 1 1
1 3-pseudo-9 0 0 1 1
1 3-pseudo-10 0 0 1 1
1 3-pseudo-11 0 0 1 1
1 3-pseudo-12 0 0 1 1
1 3-pseudo-13 0 0 1 1
1 3-pseudo-14 0 0 1 1
1 3-pseudo-15 0 0 1 1
2 3 0 0 2 2
2 3-pseudo-1 0 0 2 1
2 3-pseudo-2 0 0 2 1
2 3-pseudo-3 0 0 2 1
2 3-pseudo-4 0 0 2 1
2 3-pseudo-5 0 0 2 1
2 3-pseudo-6 0 0 2 1
2 3-pseudo-7 0 0 2 1
2 3-pseudo-8 0 0 2 1
2 3-pseudo-9 0 0 2 1
2 3-pseudo-10 0 0 2 1
2 3-pseudo-11 0 0 2 1
2 3-pseudo-12 0 0 2 1
2 3-pseudo-13 0 0 2 1
2 3-pseudo-14 0 0 2 1
2 3-pseudo-15 0 0 2 1
...

The file consists of one case from each pedigree and 15 created pseudocontrols, one for each joint allele combination at the two SNPs that was not transmitted to the case. The pedigree ID in the first column is repeated, and the pseudocontrol subject IDs are the case subject ID with “-pseudo-i” appended to it for the ith pseudocontrol. Note that the sex of the case is repeated in the pseudocontrols.
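
The number 15 can be understood by counting transmissions. The following Python sketch is a simplified illustration, not the PseudoCons implementation: it assumes that at each SNP each parent transmits one of its two alleles (4 possible transmissions per SNP, 16 jointly for two SNPs) and that the pseudocontrols are the 15 joint transmissions the case did not receive. The parental genotypes and function names are hypothetical, and no claim is made about the order in which PseudoCons writes the pseudocontrols.

```python
from itertools import product

# A transmission is (index of the father's allele, index of the mother's allele).
TRANSMISSIONS = list(product((0, 1), repeat=2))

def genotype(father, mother, transmission):
    """Unordered genotype produced by one transmission at one SNP."""
    f_idx, m_idx = transmission
    return "".join(sorted(father[f_idx] + mother[m_idx]))

def two_snp_pseudocontrols(parents_snp1, parents_snp2, case_transmissions):
    """Genotype pairs for the 15 joint transmissions the case did not receive."""
    pseudocontrols = []
    for t1, t2 in product(TRANSMISSIONS, repeat=2):
        if (t1, t2) == case_transmissions:
            continue  # skip the combination actually transmitted to the case
        pseudocontrols.append(
            (genotype(*parents_snp1, t1), genotype(*parents_snp2, t2))
        )
    return pseudocontrols

# Hypothetical trio: both parents heterozygous A/G at both SNPs,
# and the case received the first allele from each parent at each SNP.
parents = (("A", "G"), ("A", "G"))
pseudos = two_snp_pseudocontrols(parents, parents, ((0, 0), (0, 0)))
print(len(pseudos))
```

When a parent is homozygous, some of the 15 pseudocontrols carry identical genotypes; the sketch still returns 15 entries in that situation.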

This time the created binary map file, myCasePseudocontrols15.bim, consists only of the two SNPs used to create the pseudocontrols:

1 rs1 0 1000 G A
1 rs3 0 3000 G A

The created binary pedigree file, myCasePseudocontrols15.bed, likewise contains data for these two SNPs only.
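
The correspondence between the -snpnos and -snpnames options in this example can be checked against the original binary map file. The Python sketch below is illustrative only: it assumes the standard PLINK .bim layout with the SNP name in the second column, and the function name is hypothetical.

```python
BIM_TEXT = """\
1 rs1 0 1000 G A
1 rs2 0 2000 G A
1 rs3 0 3000 G A
1 rs4 0 4000 G A
2 rs5 0 10000 G A
2 rs6 0 20000 G A
2 rs7 0 30000 G A
3 rs8 0 16000 G A
3 rs9 0 32000 G A
3 rs10 0 48000 G A
"""

def snp_names_at(bim_text, positions):
    """Return the SNP names found at the given 1-based positions in file order."""
    names = [line.split()[1] for line in bim_text.splitlines() if line.strip()]
    return [names[position - 1] for position in positions]

selected = snp_names_at(BIM_TEXT, [1, 3])
print(selected)
```

For the example map file, positions 1 and 3 correspond to rs1 and rs3, so the two commands in this section select the same pair of SNPs.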

5.4 Proband

To control which case is chosen from each pedigree, a proband file may be used as follows:

./pseudocons -pro proband.dat -i examplePsConsData.bed -o myCasePseudocontrolsPro.bed

The proband file is a list of pedigree IDs and subject IDs. The example proband file is as follows:

1 3
2 3
3 3
4 3
5 3
...
73 3
74 3
75 3
76 5
77 5
78 5
...
99 5
100 5

This will create screen output similar to the following:

PseudoCons: pseudocontrols from pedigree data v1.0
--------------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: releases/examplePsConsData.bed
Output file: myCasePseudocontrolsPro.bed
Log file: PseudoCons.log
Number of pseudo controls per trio: 1
Proband file: releases/proband.dat

Number of subjects: 375
          Males: 188 (50.1333%)
          Females: 187 (49.8667%)
          Unknown sex: 0 (0%)
          Affected: 133 (35.4667%)
          Unaffected: 242 (64.5333%)
Number of SNPs: 10

Number of pedigrees: 100
Mean pedigree size: 3.75
Standard deviation of pedigree size: 0.833333

Number of trios used to create pseudo controls: 100
Number of pedigrees with no pseudo controls: 0

Number of cases: 100
          Males: 48 (48%)
          Females: 52 (52%)
          Unknown sex: 0 (0%)

Number of pseudo controls: 100
          Males: 48 (48%)
          Females: 52 (52%)
          Unknown sex: 0 (0%)

Run time: less than one second

Note that the numbers of males and females differ from the previous examples because different cases were chosen. The sex ratio is now close to 1 because the proband file ensures that affected offspring are chosen rather than affected mothers; choosing the mother is possible in the last group of pedigrees, where the parents of the mother are also included.

For more on probands see section 4.1.1.
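
When preparing or checking a proband file it can help to load it into a lookup table. The Python sketch below is a minimal illustration that assumes one whitespace-separated pedigree ID and subject ID per line, as in the example file; it uses only a few of the lines shown above and the function name is hypothetical.

```python
PROBAND_TEXT = """\
1 3
2 3
76 5
100 5
"""

def read_probands(text):
    """Map pedigree ID to proband subject ID, one pair per non-empty line."""
    probands = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        pedigree_id, subject_id = line.split()
        probands[pedigree_id] = subject_id
    return probands

probands = read_probands(PROBAND_TEXT)
print(probands["76"])
```

IDs are kept as strings, since pedigree and subject IDs in PLINK files need not be numeric.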

5.5 Extra Trios

It is possible to use more than one case/parent trio from each pedigree by using the -xtrio option as follows:

./pseudocons -xtrio -i examplePsConsData.bed -o myCasePseudocontrolsX.bed

This will create screen output similar to the following:

PseudoCons: pseudocontrols from pedigree data v1.0
--------------------------------------------------
Copyright 2013 Richard Howey, GNU General Public License, v3
Institute of Genetic Medicine, Newcastle University

Parameters:
Input file: releases/examplePsConsData.bed
Output file: myCasePseudocontrolsX.bed
Log file: PseudoCons.log
Number of pseudocontrols per trio: 1
Allowing extra trios

Number of subjects: 375
          Males: 188 (50.1333%)
          Females: 187 (49.8667%)
          Unknown sex: 0 (0%)
          Affected: 133 (35.4667%)
          Unaffected: 242 (64.5333%)
Number of SNPs: 10

Number of pedigrees: 100
Mean pedigree size: 3.75
Standard deviation of pedigree size: 0.833333

Number of trios used to create pseudocontrols: 133
Number of pedigrees with no pseudocontrols: 0

Number of cases: 133
          Males: 58 (43.609%)
          Females: 75 (56.391%)
          Unknown sex: 0 (0%)

Number of pseudocontrols: 133
          Males: 58 (43.609%)
          Females: 75 (56.391%)
          Unknown sex: 0 (0%)

Run time: less than one second

Note that in this output 133 case/parent trios are used to create the pseudocontrol data, but there are only 100 pedigrees. The extra 33 trios were taken from pedigrees containing more than one case/parent trio. This option will take as many case/parent trios from one pedigree as possible, but for this example data set it takes no more than 2 per pedigree. The number of pedigrees with no available case/parent trios is also reported, which for this example data set is zero.

Extra care should be taken in interpreting any subsequent analysis using this option.
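
One way to see which pedigrees contribute more than one trio is to count the cases per pedigree in the output family file. The Python sketch below is illustrative only: the rows are hypothetical (they are not taken from the actual output), the function name is hypothetical, and it assumes the family file written with -xtrio has the same six-column layout as in the earlier examples, with phenotype 2 marking a case.

```python
from collections import Counter

# Hypothetical rows in .fam layout: pedigree 76 contributes two cases.
FAM_TEXT = """\
75 3 0 0 1 2
75 3-pseudo-1 0 0 1 1
76 3 0 0 2 2
76 3-pseudo-1 0 0 2 1
76 5 0 0 1 2
76 5-pseudo-1 0 0 1 1
"""

def trios_per_pedigree(fam_text):
    """Count cases (phenotype 2), and hence trios used, in each pedigree."""
    counts = Counter()
    for line in fam_text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[5] == "2":
            counts[fields[0]] += 1
    return counts

counts = trios_per_pedigree(FAM_TEXT)
extra_trios = sum(n - 1 for n in counts.values())
print(dict(counts), extra_trios)
```

Applied to the example data, the same calculation would account for the 33 extra trios reported above (133 trios from 100 pedigrees).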