Some thoughts on software for family history.
[This article first appeared in Forum, the newsletter of the Council of Family Societies, of which the Lin(d)field One Name Group is a founder member.]
Much has been written about the use of computers in Family History, and it is probably true to say that if you were to ask 50 people for their opinion on the most efficient software and method of using it, you would obtain 50 different answers! The observations which follow inevitably represent a personal view, and are admittedly based upon limited experience of the various alternatives favoured by other family historians. Having said that, we have found a solution which serves the Lin(d)field One Name Group reasonably well, and others may be interested in the logic by which we arrived at that solution.
Many of the tasks for which the family historian uses a computer are basically no different to those undertaken in other spheres of activity. Word processing and Desk-Top Publishing, for example, are uses shared with most business operations and increasingly with leisure activity and education. Similarly, a family history society may use spreadsheets, graphics packages and databases for accounting, membership records and for presenting material in support of lectures and meetings. The Lin(d)field One Name Group has certainly used all of these to various degrees in the two and a half years since its formation. Clearly, some general purpose packages, particularly databases and word processors, may also be used to store data on individuals for the purposes of research, while spreadsheets and drawing packages may provide a method of printing and displaying family relationships.
Family history differs from other activities however, in the nature of the information stored and also in the conventions which have become accepted for its presentation. There is an obvious requirement to be able to link records on individuals and to label these linkages in terms of family relationships. The data relating to individuals may be somewhat vague, particularly with regard to dates and the spellings of names. There is generally a need to search records on the basis of particular attributes, not necessarily the names. It is on the choice of software for this complex task of recording and searching records, and presenting relationships, that opinion is most widely divided.
Some researchers advocate the use of purpose-designed genealogical packages, while others favour spreadsheets, word processors or general purpose database software, which they adapt for their particular way of working. Clearly, different solutions work for different people. The choice depends to some extent on whether one is conducting a one-name study or recording a single family; it must also depend to a considerable degree on the likely numbers of individuals to be recorded. The basic requirements are probably common to most situations, but the relative importance attached to each depends on the application. The basic requirements may perhaps be summarised as follows:
- Storage: We need to store, and subsequently modify, certain items of data in respect of each individual, such as dates and places of birth, marriage and death. Additionally, we need to store textual and possibly graphical material as part of each record or associated with it.
- Search: We need to be able to search large numbers of records for individuals having a particular name, or on the basis of other attributes such as occupation or location. Ideally, the system should make an automatic search for matching records whenever we attempt to enter a new record.
- Exchange: We need to be able to exchange data with other researchers, preferably in a machine-readable form such that the recipient is not required to type in vast amounts of data.
- Linkage: We need to be able to link individual records in a family structure, such that we can move up and down through generations, and sideways between siblings.
- Presentation: We need to be able to present the data, both on screen and also as output to a printer or as a file for transfer to a word processor; this presentation needs to include both individual records and also related sets of records with relationships shown. The formats available for presentation should include as many as possible of the conventional formats used in family history, such as trees, Ahnentafel, Register format etc.
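As a rough illustration of how the storage, search and linkage requirements fit together, the sketch below models a small family file in Python. It is not how any particular genealogy package works: the class names, field names and methods are all invented for the example. Dates are held as free text so that vague entries can be stored, adding a record returns any existing records of the same name, and the Ahnentafel method uses the conventional numbering in which the subject is 1, the father of person n is 2n and the mother is 2n+1.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Person:
    """One individual record. All field names are illustrative."""
    record_id: int
    given: str
    surname: str
    birth_date: str = ""       # free text, so "abt 1821" or "18th century" fit
    birth_place: str = ""
    notes: str = ""
    father_id: Optional[int] = None
    mother_id: Optional[int] = None


class FamilyFile:
    """A hypothetical in-memory store for linked individual records."""

    def __init__(self) -> None:
        self.people: Dict[int, Person] = {}

    def find_matches(self, given: str, surname: str) -> List[Person]:
        """Return existing records with the same name, ignoring case."""
        return [p for p in self.people.values()
                if p.given.lower() == given.lower()
                and p.surname.lower() == surname.lower()]

    def add(self, person: Person) -> List[Person]:
        """Store a record and return earlier records of the same name."""
        matches = self.find_matches(person.given, person.surname)
        self.people[person.record_id] = person
        return matches

    def search(self, text: str) -> List[Person]:
        """Find records mentioning the text in any field, not only names."""
        needle = text.lower()
        found = []
        for p in self.people.values():
            fields = [p.given, p.surname, p.birth_date, p.birth_place, p.notes]
            if needle in " ".join(fields).lower():
                found.append(p)
        return found

    def children_of(self, record_id: int) -> List[Person]:
        """Move down a generation."""
        return [p for p in self.people.values()
                if record_id in (p.father_id, p.mother_id)]

    def siblings_of(self, record_id: int) -> List[Person]:
        """Move sideways: everyone sharing at least one known parent."""
        me = self.people[record_id]
        return [p for p in self.people.values()
                if p.record_id != record_id
                and ((me.father_id is not None and p.father_id == me.father_id)
                     or (me.mother_id is not None and p.mother_id == me.mother_id))]

    def ahnentafel(self, record_id: int) -> Dict[int, Person]:
        """Number the known ancestors: subject 1, father 2n, mother 2n+1.

        Assumes the links contain no loops (nobody is their own ancestor).
        """
        table: Dict[int, Person] = {}
        pending = [(1, record_id)]
        while pending:
            number, rid = pending.pop()
            person = self.people.get(rid)
            if person is None:
                continue
            table[number] = person
            if person.father_id is not None:
                pending.append((2 * number, person.father_id))
            if person.mother_id is not None:
                pending.append((2 * number + 1, person.mother_id))
        return dict(sorted(table.items()))
```

In use, one would create a `FamilyFile`, call `add()` for each new `Person`, and look at the returned list before deciding whether the newcomer duplicates an existing record. Exchange and the printed formats are left out of the sketch.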
We can obviously provide the necessary storage in a word processor, using a mixture of free text and tabulated data, and most word processing packages also enable searches to be made for particular words or names. However, the wordsearch facility in a word processor usually works by moving through the text to each matching word in turn, rather than by allowing the user to choose from a list. Searching a particular field in a table is also far from straightforward, and it is even more difficult to arrange for similar entries to be displayed automatically when attempting to enter a new individual. Spreadsheets, once used mainly by accountants, have developed to include many of the facilities of a database, such as searching and sorting of records. As such they may have something to offer the family historian, provided that the number of records fits within the permitted size of the spreadsheet. However, they are not generally designed to show linkages between sets of data, at least not in terms of hierarchical or family relationships.
Some family historians use spreadsheets to draw conventional family trees, by exploiting the cellular structure of the spreadsheet to combine lines and blocks of text. This may be quite effective for small sections of a family, but must be very difficult to modify when new branches are discovered in the earliest generations. It is also restricted to the one output format and does not allow data to be exchanged other than in eye-readable form.
Graphics packages, including CAD (Computer Aided Design) software, offer perhaps the most flexibility in printing out drop-line trees and other graphical material, but offer virtually nothing in the areas of data storage, search and exchange.
Given the inherent problems of keeping the same data in several different forms, it seems to me that the advantages of concentrating everything in a single package far outweigh any disadvantages. In view of the limitations of word processing, spreadsheet and graphics software, the most logical choice seems to be a database of some sort. Whilst general purpose databases may have some facilities for linking records, and have unrivalled capacity for storage and search, I believe that purpose-designed genealogical software has the edge in terms of presentation facilities and information exchange. Whilst it may be possible to set up a general purpose database to print data in the various standard formats, and even to format the data in GEDCOM format for sending to other researchers, it hardly seems worth the effort when genealogical packages offer these features as standard. The strengths and weaknesses of the various approaches may be summarised as follows:
| Type of package | Storage | Search | Exchange | Linkage | Output |
|---|---|---|---|---|---|
| Word processor | Fair but unstructured | Limited | Very limited | Poor | Limited |
| Spreadsheet | Good but size may be limited | Fair | Poor | Poor | Limited |
| Spreadsheet – tree format | Limited especially text & pictures | Limited | Poor | Difficult to modify | Limited |
| Graphics / CAD package | Limited especially text | Poor if any | Poor | Difficult to modify | Good except for text based |
| General purpose database | Good | Excellent | Good | Good | Graphics very limited |
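To give an idea of what formatting data in GEDCOM from a general purpose database would involve, here is a minimal Python sketch. GEDCOM is a line-based text format in which each line carries a level number, an optional cross-reference such as `@I1@`, a tag and a value. The function names and sample values below are invented for illustration, and the output is deliberately simplified: a real file needs a fuller header and would normally also carry FAMC/FAMS pointers from each individual back to the family records, so an importing program may reject this as it stands.

```python
from typing import List


def gedcom_individual(xref: str, given: str, surname: str,
                      birth_date: str = "", birth_place: str = "") -> List[str]:
    """Build the lines for one individual; the surname goes between slashes."""
    lines = [f"0 @{xref}@ INDI", f"1 NAME {given} /{surname}/"]
    if birth_date or birth_place:
        lines.append("1 BIRT")
        if birth_date:
            lines.append(f"2 DATE {birth_date}")
        if birth_place:
            lines.append(f"2 PLAC {birth_place}")
    return lines


def gedcom_family(xref: str, husband: str, wife: str,
                  children: List[str]) -> List[str]:
    """Build the lines for one family record linking individuals by xref."""
    lines = [f"0 @{xref}@ FAM"]
    if husband:
        lines.append(f"1 HUSB @{husband}@")
    if wife:
        lines.append(f"1 WIFE @{wife}@")
    lines.extend(f"1 CHIL @{child}@" for child in children)
    return lines


def gedcom_file(records: List[List[str]]) -> str:
    """Wrap records in a bare header and trailer (simplified, see lead-in)."""
    lines = ["0 HEAD"]
    for record in records:
        lines.extend(record)
    lines.append("0 TRLR")
    return "\n".join(lines) + "\n"
```

Even this reduced version shows why the effort is hard to justify: every event, place and relationship needs its own tag and level, all of which a genealogical package writes out as standard.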
I have seen it suggested recently that for a one-name study, it is desirable to use a general purpose database for listing all the isolated occurrences of the name, and a genealogy package for showing only those individuals who have been connected into particular branches. Personally, I can see no advantage in this approach and have always found it much more convenient to have all the data stored in a single set of files. We store individual records, such as telephone directory entries and unconnected birth registrations, using the individual add facility in Brothers Keeper.
Of all the various output formats available, I find Register format the most useful, in that it allows all the data to be printed in a manageable form, without making large inroads into the already depleted rain forests. If all the additional information and text files are included, Register format fulfils most of the functions of a series of Group Sheets and provides a complete genealogy in a single document. Indeed, many published genealogies consist of nothing but the Register format listing, interspersed with biographical notes and comments. Since BK has the option of printing all the notes and text files associated with an individual record in a Register printout, it is possible to store all the material necessary to write a complete genealogy and to print it out whenever required.
I was asked recently for some tips on using Brothers Keeper, and I have included these as they illustrate many of the ways in which we use that particular software. Many of these are of course possible in other genealogical programs and may therefore be of wider interest.
- Always enter something in the birth date field, even if only the century in which the birth probably occurred. There is nothing worse than browsing through a list of names, half of which give no clue as to date!
- Use standard spellings, particularly of surnames. Where a number of variants exist, try to reduce the number to manageable proportions by selecting, say, 5 standard spellings, and then using the notes to give the actual spelling encountered in particular records. For example, we group all variants into LINDFIELD, LINFIELD, LINKFIELD, LINGFIELD and LINVILLE, and standardise all surnames to one of these. This also saves having to agonise over which one to use in the name field when several different sources have different spellings for the same person.
- Use the audit facility. BK has the option to maintain an audit file showing the date of all changes to a field and the content of the field before and after the change. As yet, there is no facility to record changes in the notes and source fields, so I usually record corrections with the original and new source. For example, a source for a birth date might read “Prev shown as abt 1821 from census 1851; actual date from death cert”.
- Enter every new piece of information in the database somewhere. If it cannot be associated with an existing record with a high degree of certainty, note it as unproved, using question marks in date and location fields where appropriate. At least the record will show in any search you make for that location. If it is not possible to associate the new information with one, or at least a small group of records, enter a new ‘person’ as a carrier for the information. It is very easy to combine records later when duplication becomes apparent.
- Decide on standard formats at the outset. For example, you may wish to have all surnames in capitals. I did not bother with this originally, and had built up a database of some 5000 records before importing several thousand more from the IGI from the computers at Salt Lake City. These records use capitals for surnames, so I then had a mixture, which looks untidy when printed out. I am now slowly changing all the early records to capitals and wishing I had done so in the first place! Use standard abbreviations for common words (PROBably, POSSibly etc) in order to save time and disc space, and to facilitate searching. For the common sources, use abbreviations made up of letter and number combinations not commonly found in the English language. For example, opcs/b for St Catherine’s birth references (from the Office of Population Censuses and Surveys); CR51, CR61 etc for census references. These are a useful basis for searching, and allow all records having, say, an 1851 census entry, to be listed and indexed by BK automatically.
- In general, I would advise against re-using numbers in the database. Doing so causes confusion when someone to whom you sent a printout several years ago writes with a question referring to a particular record number. If that number has since been used for someone totally different, it can be difficult to work out who they are referring to. We leave duplicate records in the database, suitably marked as such. (We use DUP followed by the alternative record number as an entry in the reference field.)
- Use —– —– for people whose names you do not yet know, rather than *unknown which causes that person not to print out on any records. The advantage of —– —– is that you can still enter data, however vague, and also it is easier to modify when an actual name is known. If you use *unknown you have to go into add mode to give that person a name, and this can be fiddly when children are already attached to the other parent.
- If in doubt, use a new record for each person to whom you find a reference. In this way, any subsequent attempt to enter the same name will cause BK to present a list of people of that name already in the database. Witnesses to marriages, for example, may well turn up more than once and it is useful to associate those marriages. It may allow you to deduce a connection between people who had the same witnesses, and of course someone who witnessed a marriage in one year might have been engaged to a brother or sister of the bride or groom, and may well turn up later as a bride or groom in another record. Witnesses at Quaker marriages are particularly useful in that the order in which they are listed reveals the closeness of their relationship to the happy couple.
- Do regular housekeeping exercises on your data. For example, you might choose a county or place in which you know there to be a number of births or marriages, and print out a list (using the wordsearch facility) of all records having a reference to that place. I find it helpful then to note these on a large piece of paper, and then to attempt to group them by village, names of parents, and period. This is one exercise in which good old fashioned paper, together with the Mark One human brain, wins hands down over the computer! Brothers Keeper is very effective for storing and searching, but is not yet clever enough to deduce connections on the basis of geography or naming patterns. Even if any connections are somewhat circumstantial, I find it helpful then to link parents and children speculatively, so that in any future search all the possible linkages are presented for consideration. Naturally, care must be taken to add a conspicuous and emphatic note to the effect that such relationships are not proved!
- Use the database for absolutely everything possible (see notes).
Any text document, such as a will or a newspaper report, can be attached to a record as a text file and will then print out in Register format and other reports. Any graphics, such as maps, drawings or photographs, can be attached as .PCX format graphics files and displayed from within BK.
I use BK as an address book for other surname researchers; anyone researching the name CLIFFORD for example is entered as RESEARCHER CLIFFORD. If I then enter ? CLIFFORD as the name in the modify screen, followed by F8 to search, I can browse through a display showing all CLIFFORD surnames in the database together with those people (not necessarily called Clifford) who are researching the name. If on the other hand I merely want to see who is researching the name, I enter RESEA CLIFFORD. (BK only searches on the first 5 letters of each name.)
I also use the database for addresses of libraries and Family History Groups to whom we send our newsletter. This allows us to print a complete set of mailing labels from BK and also to search on names and areas of interest. Searching for all records containing the place BRAINTREE for example, will give me a list of everyone who has lived in Braintree, but will also include Braintree Historical Society.
Membership records for the One Name Group are similarly stored in the database. Members are entered as individuals, with the connections to their respective ancestors, and a label added in the reference field which includes membership number and the latest year for which their subscription is paid. I can therefore assemble a list of current members by searching the reference field.
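Two of the habits described in the tips above can be sketched in a few lines of Python: reducing surname variants to a handful of standard spellings while keeping the spelling actually found, and a name search in which only the first five letters of each name count. This only imitates the behaviour described in the text and is not Brothers Keeper's own method. The five standard spellings are those given above, but the variant spellings in the lookup table and the function names are invented for the example, and treating `?` as "any given name" is an assumption drawn from the CLIFFORD example.

```python
from typing import Tuple

STANDARD_SURNAMES = ("LINDFIELD", "LINFIELD", "LINKFIELD", "LINGFIELD", "LINVILLE")

# Invented sample variants; a real table would be built up from the records.
VARIANT_TO_STANDARD = {
    "LYNFIELD": "LINFIELD",
    "LINDFEILD": "LINDFIELD",
    "LINVILL": "LINVILLE",
}


def standardise(surname_as_found: str) -> Tuple[str, str]:
    """Return (standard spelling, note recording the spelling in the source)."""
    found = surname_as_found.strip()
    key = found.upper()
    if key in STANDARD_SURNAMES:
        return key, ""
    if key in VARIANT_TO_STANDARD:
        return VARIANT_TO_STANDARD[key], f"Spelt {found} in source"
    # Unknown spellings are left for a human decision rather than guessed.
    raise ValueError(f"No standard spelling chosen yet for {found!r}")


def name_matches(query: str, given: str, surname: str) -> bool:
    """Compare a 'GIVEN SURNAME' query using only the first five letters.

    Assumes the query has exactly two parts; '?' as the given name
    matches anyone with the surname.
    """
    q_given, q_surname = query.upper().split(maxsplit=1)
    given_ok = q_given == "?" or given.upper().startswith(q_given[:5])
    return given_ok and surname.upper().startswith(q_surname[:5])
```

With records entered as RESEARCHER CLIFFORD, the query `? CLIFFORD` matches both family members and researchers, while `RESEA CLIFFORD` matches the researchers only. A side effect of the five-letter rule is that different surnames sharing their first five letters are returned together.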