Technical Reference

This section of the documentation is provided for experienced computer users who wish to know how data is stored in SPARKS. Knowing this, you may be able to read the files from your own analysis programs and go beyond the capabilities of SPARKS. However, please keep in mind that you also have the power to destroy all your records. Make sure you know what you are doing and that you have a backup of your data. Please do not modify any SPARKS data files from outside the SPARKS program; doing so bypasses SPARKS’ powerful edit checking.

SPARKS is written in an extended version of the dBASE III Plus language and is compiled with WordTech’s Quicksilver version 1.3 compiler. All the data files, lookup table files, and indexes are in standard dBASE formats that may be read from any dBASE program. Each studbook kept in SPARKS is stored in a separate sub-directory under the SPARKS sub-directory.
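
Because the files are standard dBASE files, they can also be read, without being modified, from other languages. The following is a minimal Python sketch, not part of SPARKS, that assumes the ordinary dBASE III layout: a 32-byte header, 32-byte field descriptors ending in a 0x0D byte, and fixed-width records that each start with a one-byte deletion flag. The function name read_dbf is made up for this illustration. Values are returned as the raw text stored in the file, so dates appear as YYYYMMDD and logical fields as T or F.

```python
import struct


def read_dbf(stream):
    """Yield one dict per non-deleted record of a dBASE III file.

    `stream` is a seekable binary file object opened for reading only.
    """
    header = stream.read(32)
    # Bytes 4-7: record count, 8-9: header length, 10-11: record length.
    n_records, header_len, record_len = struct.unpack("<4xIHH20x", header)

    fields = []
    while True:
        first = stream.read(1)
        if first in (b"\r", b""):  # 0x0D ends the field descriptor list
            break
        desc = first + stream.read(31)
        name = desc[:11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(desc[11])  # C, N, L, D, ...
        width = desc[16]
        fields.append((name, ftype, width))

    stream.seek(header_len)
    for _ in range(n_records):
        record = stream.read(record_len)
        if len(record) < record_len:
            break
        if record[0:1] == b"*":  # record is marked as deleted
            continue
        row, pos = {}, 1  # position 0 holds the deletion flag
        for name, _ftype, width in fields:
            row[name] = record[pos:pos + width].decode("ascii", "replace")
            pos += width
        yield row
```

To use it, open a copy of a data file in binary read mode (for example a backup copy of a studbook's MASTER.DBF) and iterate over read_dbf(f). Field values are left unpadded and untrimmed on purpose, so that key fields compare exactly as they are stored.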

Data Files

There are four main .dbf files, each with associated indexes. A complete description of them is stored in a master dictionary called metafile.dbf.

Structure for database: MASTER.DBF

Field       Type       Width  Description                File
STUD_ID     Character      6  KEY-Studbook Number        MASTER
SEX         Numeric        1  Sex                        MASTER
HYBRID      Logical        1  Hybrid flag                MASTER
DAM_ID      Character      6  Dam's Studbook Number      MASTER
SIRE_ID     Character      6  Sire's Studbook Number     MASTER
BIRTH_TYPE  Character      1  Birth type                 MASTER
BDATE       Date           8  Birth Date                 MASTER
BIRTH_EST   Character      1  Confidence in Birth Date   MASTER
REARING     Character      1  How young was raised       MASTER
MGMT_PLAN   Logical        1  Global Management Plan     MASTER
SURPLUS     Logical        1  Management Plan Surplus    MASTER
UPDATE      Date           8  Last update to record      MASTER
SEND        Logical        1  Data has been changed      MASTER
RSORT       Character      9  Report Sort Order          MASTER
SELECT1     Logical        1  Report Select Flag 1       MASTER
SELECT2     Logical        1  Report Select Flag 2       MASTER

For each studbook specimen, there is only one master record. However, in the moves file there is at least one record and often more. The key field, STUD_ID, is common to all four files and is the key that links all the data for a single studbook specimen together.
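
The one-to-many relationship described above can be sketched in Python as follows. This is only an illustration: it assumes the records have already been read into dictionaries keyed by field name, and the function name link_moves_to_master is hypothetical.

```python
from collections import defaultdict


def link_moves_to_master(master_rows, moves_rows):
    """Return {STUD_ID: (master_record, [move_records])}."""
    moves_by_id = defaultdict(list)
    for move in moves_rows:
        moves_by_id[move["STUD_ID"]].append(move)
    # Exactly one master record per specimen; zero or more moves found for it.
    return {
        master["STUD_ID"]: (master, moves_by_id.get(master["STUD_ID"], []))
        for master in master_rows
    }
```

The same grouping applies to the specials and UDF files, since all four files share the STUD_ID key.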

Structure for database: MOVES.DBF

Field       Type       Width  Description                File
STUD_ID     Character      6  KEY-Studbook Number        MOVES
TRAN_CODE   Character      2  Type of transaction        MOVES
PHYSICAL    Logical        1  Physical transfer          MOVES
OWNER       Logical        1  Ownership transfer         MOVES
LOCATION    Character      9  Physical location          MOVES
LOCAL_ID    Character      6  Institution's local ID     MOVES
TRAN_DATE   Date           8  Date of transaction        MOVES
TDATE_EST   Character      1  Confidence in Trans Date   MOVES
REM_DATE    Date           8  Removal Date               MOVES
RDATE_EST   Character      1  Confidence in Rem Date     MOVES
INSTCODE    Character      9  Institution code           MOVES

Structure for database: SPECIALS.DBF

Field       Type       Width  Description                File
STUD_ID     Character      6  KEY-Studbook Number        SPECIALS
CODE        Character      2  Type of Special Data       SPECIALS
COMMENT     Character     65  Text                       SPECIALS
SPEC_DATE   Date           8  Date of Special Data       SPECIALS

The UDF file contains records only if the user has defined UDFs. Again, the key field is always present, followed by any UDF fields.

Structure for database: UDF.DBF

Field       Type       Width  Description                File
STUD_ID     Character      6  KEY-Studbook Number        UDF

Each .dbf file has an index file (.NDX) of the same name to speed retrieval. The STUD_ID field is always part of the index key field.

Master Data File...
  index on Stud_ID to \&xSPARKS\&xSTUD\master
Moves Data File...
  index on Stud_ID + str(year(tran_date),4) + str(month(tran_date),2) +
    str(day(tran_date),2) + tran_code to \&xSPARKS\&xSTUD\moves
Special Data File...
  index on Stud_ID + str(year(spec_date),4) + str(month(spec_date),2) +
    str(day(spec_date),2) to \&xSPARKS\&xSTUD\specials
User Defined File...
  index on Stud_ID to \&xSPARKS\&xSTUD\UDF

Here the variable &xSPARKS is the sub-directory in which the SPARKS system is installed, and &xSTUD is the name of the studbook.
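
To see what the moves index key looks like, here is a small Python sketch that builds the same string as the index expression above. It relies on the dBASE str() function right-justifying a number in the given width and padding with spaces; the function name moves_index_key and the transaction code "XX" in the usage note are made up for illustration, and the studbook number is assumed to be passed exactly as stored in the 6-character field.

```python
from datetime import date


def moves_index_key(stud_id, tran_date, tran_code):
    """Mimic: Stud_ID + str(year,4) + str(month,2) + str(day,2) + tran_code."""
    return (
        f"{stud_id:<6}"            # Character 6 field, kept as stored
        f"{tran_date.year:4d}"     # str(year(tran_date), 4)
        f"{tran_date.month:2d}"    # str(month(tran_date), 2), space-padded
        f"{tran_date.day:2d}"      # str(day(tran_date), 2), space-padded
        f"{tran_code:<2}"          # Character 2 field
    )
```

For example, moves_index_key("   123", date(1989, 3, 7), "XX") gives "   1231989 3 7XX". Because a space sorts before any digit, the space-padded month and day still sort in date order within one studbook number.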

Institution List

The largest data file by far is the institution lookup table, with over 6250 records. This very important list contains the names of most of the world's zoos and aquariums, museums, dealers and exchanges, many individual collectors, and non-exhibit centers. There is room to store the complete address, although only some are provided. Each entry is assigned a mnemonic code, often a city abbreviation. It is this code that SPARKS uses to record all locations. Without the consistency SPARKS enforces by always using the same name for a location, it would not be possible to retrieve reports based upon any geographic criteria.

Structure for database: ISISISF.DBF

Field       Type       Width  Description
INSTCODE    Character      9  ISIS numeric geographic code
ISIS_MEMB   Character      1  P for ISIS, A for ARKS
MNEMONIC    Character      9  ISIS alpha geographic code
INST_NAME   Character     40  Full institution name
ADDRESS     Character     35  Institution address
CITY        Character     20  City
STATE       Character     20  State/province
COUNTRY     Character     20  Country
MAILCODE    Character     10  Mail code / ZIP code
PHONE       Character     20  Telephone number
** Total **              185
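
Note that the total of 185 is one more than the sum of the field widths. Each record in a dBASE file begins with a one-byte deletion flag, and the total includes it. The short Python check below simply redoes that arithmetic with the widths from the table above; the variable names are made up for illustration.

```python
# Field widths copied from the ISISISF.DBF structure listing.
ISISISF_WIDTHS = {
    "INSTCODE": 9, "ISIS_MEMB": 1, "MNEMONIC": 9, "INST_NAME": 40,
    "ADDRESS": 35, "CITY": 20, "STATE": 20, "COUNTRY": 20,
    "MAILCODE": 10, "PHONE": 20,
}

field_bytes = sum(ISISISF_WIDTHS.values())  # data bytes per record
record_length = 1 + field_bytes             # plus the deletion flag byte
```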

This file also has index files, in this case two:

Institution List File...
  index on instcode to \&xSPARKS\ISF_CODE
  index on mnemonic to \&xSPARKS\ISF_NAME

GENES

Software package for genetic analysis of studbook data.

Written by Robert Lacy, Chicago Zoological Society

Assuming that you have exported an appropriate pedigree data set using the Export Report from SPARKS, to run GENES, simply type GENES from the operating system, and answer the questions that appear on the screen.

GENES will, if asked politely, do:

Inbreeding calculations:
  • calculate inbreeding coefficients.
  • print out a matrix of inbreeding coefficients for hypothetical offspring that would be produced from every M x F cross of currently living animals.
Founder representation analysis:
  • calculate founder contributions to each living descendant, summed and average founder contributions to the living population, and the number of founder equivalents (see Lacy 1989 in Zoo Biology). The number of founder equivalents is the number of founders of equal contribution that would have yielded the diversity of founder genes that have come through the pedigree. If all founders contribute equally, the number of founder equivalents equals the actual number of founders (hence the name, founder equivalents). If contributions are unequal, the number of founder equivalents will be less.
Gene drop analysis of founder allele distribution:
  • a stochastic simulation of founder-allele transmission through the pedigree. The program was written by Dr. Georgina Mace, British Federation of Zoos, in FORTRAN and then translated into the C programming language and modified by Lacy. As presently dimensioned, analysis is restricted to studbooks with no more than 2000 animals with living descendants, 500 living animals, and 200 founders. Limited computer memory may further restrict these sizes. (The program will warn you if a studbook is unlikely to fit into memory. If it doesn’t fit, the program will terminate.) GENES ignores dead animals with no living descendants when running the gene drop simulation, thereby greatly reducing the computer memory needed and the running time.
  • In addition to the statistics calculated in the GENEDROP program written by Mace, the GENES version also calculates:

    Target founder representations -- parity representations corrected for the irreversible loss of founder alleles that has already likely occurred in the pedigree: algorithm developed by Jon Ballou. Note that living wild-caught animals have the highest target representations, because none of their genes are yet irreversibly lost.

    Mean allelic retention -- the fraction of a founder's genes that are present in at least one copy in the living descendant population.

    Founder genomes surviving -- the summed allelic retention; i.e., the number of founder alleles still in the population.
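
The idea behind the gene drop and the two retention statistics just defined can be sketched in a few lines of Python. This is not the GENES or GENEDROP code, only an illustration under stated assumptions: the pedigree is a dictionary mapping each animal to its (sire, dam) pair, founders have (None, None), every other animal has both parents present, and parents are listed before their offspring. The function name gene_drop and its parameters are made up.

```python
import random


def gene_drop(pedigree, living, n_iter=1000, seed=0):
    """Return (retention per founder, founder genomes surviving)."""
    rng = random.Random(seed)
    founders = [a for a, (s, d) in pedigree.items() if s is None and d is None]
    retained = {f: 0 for f in founders}  # surviving alleles, summed over runs

    for _ in range(n_iter):
        genotype = {}
        for animal, (sire, dam) in pedigree.items():
            if sire is None and dam is None:
                # Each founder starts with two unique alleles.
                genotype[animal] = ((animal, 0), (animal, 1))
            else:
                # One randomly chosen allele from each parent.
                genotype[animal] = (rng.choice(genotype[sire]),
                                    rng.choice(genotype[dam]))
        # Distinct founder alleles present in the living population.
        surviving = {allele for a in living for allele in genotype[a]}
        for founder, _copy in surviving:
            retained[founder] += 1

    # Fraction of each founder's two alleles that survive, averaged over runs.
    retention = {f: retained[f] / (2 * n_iter) for f in founders}
    return retention, sum(retention.values())
```

With two founders and a single living offspring, each founder's retention is exactly 0.5 (one of its two alleles is always passed on), and the founder genomes surviving total 1.0.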

    Genetic Evaluation using GENES 43


    A brief explanatory interlude

    "Heterozygosity" is used for several different, though closely related, concepts by geneticists. Most simply, the heterozygosity of a population is the proportion of the individuals that are heterozygous at the locus or loci of interest. This is often termed the "observed heterozygosity" of a population.

    In a randomly mating population (i.e., one in Hardy-Weinberg-Castle equilibrium), the mean heterozygosity is expected to be H = 1 - sum(p_i^2), in which p_i is the frequency of allele i. (The expected frequency of homozygotes for each allele is p_i^2.) The heterozygosity expected under Hardy-Weinberg-Castle equilibrium is often termed the "expected heterozygosity" of a population.

    For many genetic loci (typically 50% to 90%), all individuals of a population are homozygous for a single allele, i.e., the locus is monomorphic. In population management, as in other evolutionary processes, such invariant loci are of relatively little interest. (Evolution requires variation.) Often, we are concerned not with the absolute heterozygosity (observed or expected), but rather with the heterozygosity of a population relative to the heterozygosity of some starting reference population. This fractional heterozygosity is termed the "gene diversity" of a population and is sometimes symbolized P (P_t = H_t/H_0, in which P_t is the gene diversity at time t, and H_t and H_0 are the expected heterozygosities at times t and 0).

    Inbreeding reduces the probability that an individual is heterozygous at any given locus, and the inbreeding coefficient, F, of an individual is defined as the fractional reduction of that individual’s heterozygosity (across all loci) relative to the mean expected heterozygosity of the population [F_i = (H_e - H_i)/H_e, in which F_i is the inbreeding coefficient of individual i, H_e is the expected heterozygosity of the population at some reference time point, and H_i is the (observed) heterozygosity of individual i]. Note that the mean inbreeding coefficient (at time t, relative to reference time 0) of a small population that is in Hardy-Weinberg-Castle equilibrium is given by F_t = (H_0 - H_t)/H_0 = 1 - P_t.

    In the gene drop simulation in GENES (and typically in any founder analysis), the starting (observed) heterozygosity is set at 1.0, because each founder is given two unique alleles. The expected heterozygosity among the founders is 1 - sum{[1/(2 x N_f)]^2}, in which N_f is the number of founders, because p_i = 1/(2 x N_f). For reasons I won’t explain here, this expected heterozygosity of the founders is also equal to the fraction of the (expected) heterozygosity of the wild population that is expected in the founder stock (i.e., the "gene diversity" of the founders relative to the wild population from which they came, P_f = 1 - sum{[1/(2 x N_f)]^2}).
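
The formulas in this interlude are short enough to try out directly. The Python sketch below is only an illustration of the definitions above; the function names are made up and nothing here is taken from the GENES source.

```python
def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2) for a list of allele frequencies p_i."""
    return 1.0 - sum(p * p for p in freqs)


def founder_expected_heterozygosity(n_founders):
    """Each founder carries two unique alleles, so p_i = 1/(2 x N_f)."""
    n_alleles = 2 * n_founders
    return expected_heterozygosity([1.0 / n_alleles] * n_alleles)


def gene_diversity(h_t, h_0):
    """P_t = H_t / H_0, heterozygosity relative to the reference population."""
    return h_t / h_0


def mean_inbreeding(h_t, h_0):
    """F_t = (H_0 - H_t) / H_0 = 1 - P_t."""
    return 1.0 - gene_diversity(h_t, h_0)
```

For 10 founders there are 20 unique alleles, each at frequency 0.05, so the expected heterozygosity among the founders is 1 - 20 x 0.0025 = 0.95. In general the sum collapses to 1 - 1/(2 x N_f).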

    With this clarifying (?!) background on the distinction between observed heterozygosity, expected heterozygosity, gene diversity, and inbreeding coefficients, we now continue with the output from GENES:

    Fraction of wild heterozygosity retained -- the "gene diversity" of the captive population: the expected heterozygosity in the living population relative to the wild population from which the founders were taken.

    Fraction of wild heterozygosity lost – 1 minus the heterozygosity retained. If the population were randomly mating (few populations are), then the fraction of heterozygosity lost would be equal to the mean inbreeding coefficient of the population.

    Mean inbreeding coefficient realized – the mean inbreeding coefficient within the living descendant population. This is also equal to one minus the observed heterozygosity of the descendant population.

    Founder genome equivalents – the number of equally represented founders, with no loss of founder alleles, that would yield the amount of genetic diversity in the living descendant population. Thus, the number of founder genome equivalents is the number of newly wild-caught animals that would be needed to obtain the genetic diversity in the present captive population.

    Founder equivalents – the number of equally represented founders, with the observed losses of founder alleles, that would yield the amount of genetic diversity observed in the living descendant population. Founder equivalents do not correct for the losses of alleles in population bottlenecks, whereas founder genome equivalents do. (See Lacy 1989 in Zoo Biology.)
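
As a rough illustration of how these two numbers differ, the Python sketch below uses the formulas usually associated with Lacy (1989): founder equivalents = 1 / sum(p_i^2) and founder genome equivalents = 1 / sum(p_i^2 / r_i), where p_i is founder i's proportional contribution to the living population and r_i is that founder's allelic retention. Treat these as an assumption about the definitions rather than a description of the GENES code; the function names are made up.

```python
def founder_equivalents(contributions):
    """1 / sum(p_i^2); ignores any loss of founder alleles."""
    return 1.0 / sum(p * p for p in contributions)


def founder_genome_equivalents(contributions, retentions):
    """1 / sum(p_i^2 / r_i); discounts each founder by its retention r_i."""
    total = sum(
        p * p / r
        for p, r in zip(contributions, retentions)
        if p > 0  # founders with no living descendants contribute nothing
    )
    return 1.0 / total
```

With four equally contributing founders, founder_equivalents([0.25] * 4) is 4, the actual number of founders. If only half of each founder's alleles have survived, founder_genome_equivalents([0.25] * 4, [0.5] * 4) drops to 2.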

    Each of the above is calculated on the total pedigree and also on a subset that excludes contributions from animals with unknown parents (which otherwise are treated as founders). Also given are the summary statistics (mean retention, heterozygosity, founder equivalents, etc.) attainable with "perfect" management in the future, i.e., if all target founder representations are met and no further allelic losses occur.

    Before running GENES, the directory should contain:

    GENES.EXE
    XXXXXXXX.TXT (your SPARKS export file from studbook xxxxxxxx)

    To this, GENES will add data matrices xxxxxxxx.rf and xxxxxxxx.des, and output files INBREED.PRN, FOUNDER.PRN, and GD.PRN. The program also creates several temporary files that will be deleted when the program terminates normally.

    The inbreeding analysis assumes that UNK and WILD parents are unrelated to all other animals – it cannot do otherwise. Thus, animals with unknown parents will be treated as wild-caught founders.
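
For readers who want to check an inbreeding coefficient by hand, here is a minimal Python sketch of the standard kinship recursion, with unknown parents (None) treated as unrelated to everything, as described above. It is not necessarily the algorithm GENES uses, and the function name and the dictionary layout (animal mapped to a (sire, dam) pair) are assumptions made for this illustration.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Return {animal: F}, where F is the kinship of the animal's parents."""

    @lru_cache(maxsize=None)
    def depth(a):
        # Generations below the founders; unknown parents count as -1.
        if a is None:
            return -1
        sire, dam = pedigree[a]
        return 1 + max(depth(sire), depth(dam))

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0  # unknown or wild parent: assumed unrelated
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kinship(sire, dam))
        # Expand the animal from the later generation, which cannot be
        # an ancestor of the other one.
        if depth(a) < depth(b):
            a, b = b, a
        sire, dam = pedigree[a]
        return 0.5 * (kinship(sire, b) + kinship(dam, b))

    return {a: kinship(*pedigree[a]) for a in pedigree}
```

For example, the offspring of a full-sibling mating between two animals whose parents are unrelated founders has F = 0.25.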

    If one parent is known (and captive), but the other parent is WILD or UNK (as would occur if a wild-caught female gave birth to an offspring sired in the wild), GENES will treat the unknown parent as a founder. The "studbook number" of that pseudo-founder is set equal to the negative of the studbook number of the known (captive) parent. (This pseudo-founder is not added to the studbook; it is simply assumed to exist for the genetic calculations.) If an animal gives birth to several offspring with an UNK or WILD animal for the other parent, the program assumes that the unknown (pseudo-founder) parent is the same for all those offspring. The gene drop program outputs summary statistics for the entire data set (treating unknowns as wild-caught founders) and for only those founders recorded as truly WILD. (A few statistics cannot be calculated on the subset without unknown "founders". Those spots are left blank on the output.)
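
The pseudo-founder rule can be written out as a short Python sketch. It assumes studbook numbers are held as integers and that unknown parents are recorded as the strings "UNK" or "WILD"; the function name and data layout are made up for illustration and are not taken from GENES.

```python
UNKNOWN = {"UNK", "WILD"}


def assign_pseudo_founders(records):
    """records: {animal: (sire, dam)}; returns a pedigree with founders as (None, None)."""
    resolved = {}
    for animal, (sire, dam) in records.items():
        sire_unknown, dam_unknown = sire in UNKNOWN, dam in UNKNOWN
        if sire_unknown and dam_unknown:
            sire = dam = None          # the animal itself is a founder
        elif sire_unknown:
            sire = -dam                # pseudo-founder mate of the known dam
        elif dam_unknown:
            dam = -sire                # pseudo-founder mate of the known sire
        resolved[animal] = (sire, dam)

    # Pseudo-founders exist only for the calculations; add them as founders.
    for sire, dam in list(resolved.values()):
        for parent in (sire, dam):
            if parent is not None and parent < 0 and parent not in resolved:
                resolved[parent] = (None, None)
    return resolved
```

Because the pseudo-founder's number depends only on the known parent, two offspring of dam 10 with unknown sires both receive sire -10, which matches the assumption that the unknown parent is the same animal for all of them.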

    One unknown parent causes no problem for the inbreeding calculations, beyond the obvious loss of information (and possible under-estimation of F) if the unknown parent is in fact related to other animals in the studbook.

    If none of this makes sense, try the program and see what happens.

    GENES is dimensioned to handle up to 2000 animals. Let Lacy know if you want a version with larger dimensions.

    GENES can be made quite fast by running it on a RAM disk. The program will be very slow, and will handle only very small studbooks, if it is run from a floppy disk. If the computer has a math coprocessor, the program will use it and will run noticeably faster.

    GENES assumes that the studbook data are ASCII format as produced by using the SPARKS Export utility. Minimally necessary is a file containing the following fields:

    ID/Sire/Dam/Sex/Selected/Dead/NewID/NewSireID/NewDamID

    Lacy welcomes comments on GENES, to which he will respond if he has time. No guarantees of any sort are provided with this software: little effort has gone into testing and debugging. Use at your own risk.
