4 Data Processing
This section explains how PseudoCons processes the pedigree data to produce the case-control output data.
By default, one case/parent trio is taken from each pedigree; the case is taken from this trio and one pseudocontrol is created from it. The trio chosen is that of the first case in the pedigree file who also has both parents in the pedigree file.
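The default rule can be sketched as follows. This is a minimal Python illustration, not the PseudoCons source code: the `Subject` record, the function name `first_trio` and the use of "0" for an unknown parent are assumptions made for the example.

```python
from typing import NamedTuple, Optional

class Subject(NamedTuple):
    ped: str        # pedigree number
    sid: str        # subject number within the pedigree
    father: str     # father's subject number, "0" if unknown (assumed convention)
    mother: str     # mother's subject number, "0" if unknown (assumed convention)
    affected: bool  # True if the subject is a case

def first_trio(subjects: list[Subject], ped: str) -> Optional[tuple[str, str, str]]:
    """Return (case, father, mother) for the first case, in file order,
    whose parents are both present in the pedigree; None if there is none."""
    present = {s.sid for s in subjects if s.ped == ped}
    for s in subjects:  # the list is assumed to be in pedigree-file order
        if s.ped == ped and s.affected and s.father in present and s.mother in present:
            return (s.sid, s.father, s.mother)
    return None

pedigree = [
    Subject("1", "1", "0", "0", False),
    Subject("1", "2", "0", "0", True),   # affected, but parents are not in the file
    Subject("1", "3", "1", "2", True),   # first case with both parents present
    Subject("1", "4", "1", "2", True),
]
print(first_trio(pedigree, "1"))  # ('3', '1', '2')
```

Subject 2 is skipped even though it appears first, because neither of its parents is in the file.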
A pedigree may contain more than one case/parent trio that could supply the case and the created pseudocontrol. For a pedigree file with many large pedigrees, the choice of trio could alter the results of any subsequent analysis. For example, if pedigrees are ascertained on the basis of a particular affected child, but case/parent trios consisting of the parents and grandparents are chosen instead, this could bias the analysis. With this in mind, it is possible to supply an optional proband file containing a list of all the affected subjects that are of interest. Each line of the file identifies a subject by pedigree number and subject number, corresponding to the pedigree file given to PseudoCons. For example, a proband file may look as follows:
1 4
2 5
3 2
5 12
7 3
9 3
10 2
The proband file is used in PseudoCons with the -pro option as follows:
./pseudocons -pro proband.dat -i mydata.bed -o mycasepscondata.bed
The name of the proband file should follow immediately after the -pro option. The following points should be noted about proband files:
- If a proband file is given, it is not necessary to supply a subject for every pedigree. For example, for smaller pedigrees you may be happy to use the default setting.
- The proband subjects do not need to appear in any particular order in the file.
- If the proband subject is not affected, a warning message will be displayed and the pedigree processed using the default settings.
- If a proband subject does not exist in the pedigree file, a warning message will be displayed and the pedigree file will be processed as normal.
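The rules listed above can be sketched in Python. This is an illustration only, not the PseudoCons implementation. It assumes one proband per pedigree, members given as (subject, father, mother, affected) tuples in file order, "0" for an unknown parent, and that "processed as normal" means falling back to the default choice. The function names and warning texts are hypothetical, and the sketch does not cover a proband whose parents are missing, since the text above does not say how that case is treated.

```python
def read_probands(lines):
    """Map pedigree number -> proband subject number (assumes one proband per pedigree)."""
    probands = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:  # skip blank or malformed lines
            ped, subject = fields
            probands[ped] = subject
    return probands

def default_case(members):
    """First affected subject, in file order, whose parents are both in the pedigree."""
    ids = {sid for sid, _, _, _ in members}
    for sid, father, mother, affected in members:
        if affected and father in ids and mother in ids:
            return sid
    return None

def choose_case(ped, members, probands):
    """Return (case, warning); warning is None when the choice raised no problem."""
    wanted = probands.get(ped)
    if wanted is None:  # no proband listed for this pedigree: default setting
        return default_case(members), None
    status = {sid: affected for sid, _, _, affected in members}
    if wanted not in status:
        return default_case(members), f"proband {wanted} not found in pedigree {ped}"
    if not status[wanted]:
        return default_case(members), f"proband {wanted} in pedigree {ped} is not affected"
    return wanted, None

probands = read_probands(["1 4", "2 5"])  # order of lines does not matter
members = [("1", "0", "0", False), ("2", "0", "0", False),
           ("3", "1", "2", True), ("4", "1", "2", True)]
print(choose_case("1", members, probands))  # ('4', None)
print(choose_case("3", members, probands))  # ('3', None)
```

In the second call no proband is listed for pedigree 3, so the default case is used without a warning.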
All possible case/parent trios from a pedigree can be used, counted as if they were independent, by using the -xtrio option. The trios may overlap if a parent is also a case. Depending on the analysis you want to do, this assumption of independence may be more or less valid.
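To show how trios can overlap, the following Python sketch lists every case/parent trio in a small three-generation pedigree. It is an illustration under assumed conventions (members as (subject, father, mother, affected) tuples, "0" for an unknown parent, hypothetical function name `all_trios`), not the code PseudoCons runs.

```python
def all_trios(members):
    """Return (case, father, mother) for every affected subject with both parents present."""
    ids = {sid for sid, _, _, _ in members}
    return [(sid, father, mother)
            for sid, father, mother, affected in members
            if affected and father in ids and mother in ids]

members = [
    ("1", "0", "0", False),  # grandfather
    ("2", "0", "0", False),  # grandmother
    ("3", "1", "2", True),   # affected child of 1 and 2
    ("4", "0", "0", False),  # spouse of 3
    ("5", "3", "4", True),   # affected grandchild
]
print(all_trios(members))  # [('3', '1', '2'), ('5', '3', '4')]
```

Subject 3 is the case in the first trio and a parent in the second, so the two trios share an individual and are not strictly independent.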
The pseudocontrols are created using the non-transmitted alleles. For example, if the alleles of the case are A/A and the alleles of the parents are A/G and A/G, then the created pseudocontrol will have alleles G/G.
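A minimal Python sketch of this rule for a single SNP is given below. It is not the PseudoCons implementation: genotypes are assumed to be pairs of allele strings, the function name `pseudocontrol` is hypothetical, and missing genotypes are not handled.

```python
from itertools import product

def pseudocontrol(case, father, mother):
    """Return the genotype formed by the two non-transmitted parental alleles."""
    # Try each (father allele, mother allele) transmission and keep the first
    # one that reproduces the case genotype, ignoring allele order.
    for i, j in product((0, 1), repeat=2):
        if sorted((father[i], mother[j])) == sorted(case):
            return (father[1 - i], mother[1 - j])
    raise ValueError("case genotype is not consistent with the parents")

print(pseudocontrol(("A", "A"), ("A", "G"), ("A", "G")))  # ('G', 'G')
```

When more than one transmission matches the case, each leaves the same pair of alleles untransmitted, so taking the first match does not change the resulting genotype apart from allele order.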
The three pseudocontrols are created using every possible genotype from the parents that contains a non-transmitted allele. For example, if the alleles of the case are A/A and the alleles of the parents are A/G and A/G, then the three created pseudocontrols will have alleles G/G, A/G and G/A.
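The example above can be reproduced with a short Python sketch: the parents can produce four genotypes (one allele from each parent), and removing the one transmitted to the case leaves three. This is an illustration under assumed conventions (genotypes as pairs of allele strings, written father's allele first; hypothetical function name), not the PseudoCons source.

```python
from itertools import product

def three_pseudocontrols(case, father, mother):
    """Return the three parental genotypes other than the one transmitted to the case."""
    # All four ways of taking one allele from the father and one from the mother.
    combos = [(father[i], mother[j]) for i, j in product((0, 1), repeat=2)]
    for k, genotype in enumerate(combos):
        if sorted(genotype) == sorted(case):  # first transmission matching the case
            return combos[:k] + combos[k + 1:]
    raise ValueError("case genotype is not consistent with the parents")

print(three_pseudocontrols(("A", "A"), ("A", "G"), ("A", "G")))
# [('A', 'G'), ('G', 'A'), ('G', 'G')]
```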
Given two SNPs, the 15 pseudocontrols are created using any possible genotype pair from the parents that contains a non-transmitted allele.
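The count of 15 can be understood as follows: each SNP has four possible parental genotypes, giving 4 x 4 = 16 genotype pairs, and removing the pair carried by the case leaves 15. The Python sketch below illustrates this counting under assumed conventions (genotypes as pairs of allele strings, hypothetical function names); it is not the PseudoCons implementation and does not handle missing genotypes.

```python
from itertools import product

def parental_genotypes(father, mother):
    """All four genotypes formed from one paternal and one maternal allele."""
    return [(father[i], mother[j]) for i, j in product((0, 1), repeat=2)]

def fifteen_pseudocontrols(case1, father1, mother1, case2, father2, mother2):
    """Return the 15 two-SNP genotype pairs other than the pair transmitted to the case."""
    pairs = list(product(parental_genotypes(father1, mother1),
                         parental_genotypes(father2, mother2)))  # 16 pairs
    for k, (g1, g2) in enumerate(pairs):
        if sorted(g1) == sorted(case1) and sorted(g2) == sorted(case2):
            return pairs[:k] + pairs[k + 1:]  # drop the first pair matching the case
    raise ValueError("case genotypes are not consistent with the parents")

controls = fifteen_pseudocontrols(
    ("A", "A"), ("A", "G"), ("A", "G"),   # SNP 1: case, father, mother
    ("C", "C"), ("C", "T"), ("C", "T"),   # SNP 2: case, father, mother
)
print(len(controls))  # 15
```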