"Similar Neighbor" Techniques in Forestry

There is a technique that has been developing in the sampling literature for a couple of years now, and perhaps it is time to discuss it in the newsletter. We don't like to jump on bandwagons too soon, but this is a sensible method, and it has been used by several different groups with success. We consider practical success more important than published papers.

One of the chief developers of this method is Al Stage. He is a forest researcher who is known for his modeling efforts, but is equally competent in sampling issues. He is one of the deeper thinkers in forestry.

Suppose you have 100 stands sampled, perhaps with 4 or 5 plots in each stand. Let's assume that they were sampled using 1/10 acre plots. You also have 2,000 stands that have not been sampled. How do you get values for these other 2,000 stands?

A traditional method has been to assign them to strata (perhaps 5 or 10 strata), then give all the stands in a stratum the same characteristics. You get those characteristics from the average of the measured stands in that stratum. There is only one description for each of a small number of strata, and every stand is assigned to one of them.

This traditional strata-based approach was designed when people kept records on 4x6 index cards (the really forward thinkers used 5x8 cards). It is out of date. There is no reason why everything in a stratum has to have the same answer. There is often no reason to use strata, for that matter. Let's see if we can make a reasonable case for an alternative using an example.

Suppose we just choose whichever of the 100 measured stands is "most similar" to one where we have no measurements. We might choose the "most similar" data set by comparing measurements we have on all the stands (elevation, aspect, slope, etc.) or we might use photos, personal knowledge, old maps or any other technique to match them. The more clever we are, the better we will match the measured and unmeasured stands.
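As a rough illustration of matching on stand attributes, here is a minimal sketch in Python. The stand names, attribute values, scaling constants and the distance measure are all made up for the example; a scaled Euclidean distance is only one of many ways "similarity" could be defined.

```python
import math

# Hypothetical stand attributes: (elevation in feet, slope in percent, aspect in degrees).
measured = {
    "M1": (1200.0, 10.0, 180.0),
    "M2": (2500.0, 35.0, 90.0),
    "M3": (3100.0, 50.0, 350.0),
}
unmeasured = {
    "U1": (1300.0, 12.0, 170.0),
    "U2": (3000.0, 45.0, 10.0),
}

# Assumed scales that put the three attributes on a comparable footing.
SCALES = (1000.0, 20.0, 90.0)


def aspect_difference(a, b):
    """Smallest angle between two aspects, so 350 and 10 degrees differ by 20."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def distance(x, y):
    """Scaled Euclidean distance between two stands (an assumed measure)."""
    d_elev = (x[0] - y[0]) / SCALES[0]
    d_slope = (x[1] - y[1]) / SCALES[1]
    d_aspect = aspect_difference(x[2], y[2]) / SCALES[2]
    return math.sqrt(d_elev ** 2 + d_slope ** 2 + d_aspect ** 2)


def most_similar(target, candidates):
    """Return the name of the measured stand closest to the target stand."""
    return min(candidates, key=lambda name: distance(target, candidates[name]))


# Match every unmeasured stand to its nearest measured stand.
matches = {name: most_similar(attrs, measured) for name, attrs in unmeasured.items()}
print(matches)
```

Aspect is treated as an angle on a circle so that a stand facing 350 degrees is considered close to one facing 10 degrees. How the attributes are scaled against one another is a judgment call and will change which stands get matched.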

We now transfer all the data from the measured stand to the unmeasured stand. We do this so that every stand can be grown individually, and so that the relationships within the data stay realistic. Using real sets of data is a good way to do this. We have 100 sets to choose from, and can even put several sets of data together in order to get intermediate values.
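One possible reading of "putting several sets of data together" is to pool the tree records from two or more similar stands and scale down each record's trees per acre by that stand's share. The sketch below assumes a hypothetical tree record of (DBH in inches, trees per acre); it is an illustration, not a description of any particular inventory system.

```python
def pool_tree_lists(tree_lists, weights):
    """Combine tree lists, giving each list a share proportional to its weight."""
    total = sum(weights)
    pooled = []
    for trees, weight in zip(tree_lists, weights):
        share = weight / total
        for dbh, trees_per_acre in trees:
            # Keep the real tree, but let it represent fewer trees per acre.
            pooled.append((dbh, trees_per_acre * share))
    return pooled


# Hypothetical tree records for two measured stands: (DBH inches, trees per acre).
stand_a = [(12.0, 40.0), (18.0, 20.0)]
stand_b = [(10.0, 80.0), (16.0, 40.0)]

# Equal weights give a stand halfway between the two.
pooled = pool_tree_lists([stand_a, stand_b], [1.0, 1.0])
total_trees_per_acre = sum(tpa for _, tpa in pooled)
print(pooled, total_trees_per_acre)
```

Every tree in the pooled list is still a real measured tree, so the relationships within each record are preserved; only the number of trees it stands for has changed.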

Soon, we have individual sets of data for every one of the 2,000 unmeasured stands. As you can imagine, there are several ways to automate the process using "similarity" defined in a number of ways. These data sets can be compiled, grown, or recomputed to different standards.

Does this set of similar plots give the correct answer? Well, what should that "correct answer" be? Suppose that the correct value using the 100 sample stands came to 22,000 BF per acre on average. Would you like the entire inventory to have the same average? That would seem reasonable, so let's arrange it.

First, compute the volumes of the 2,000 unmeasured stands from the data you matched to them. Suppose they average 24,500 BF per acre. The relative volumes of the stands are pretty good if you have matched them well, but the total is too high, so we could scale them all by (22,000/24,500), which brings each one to 89.8% of its current amount.
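The arithmetic can be sketched in a few lines of Python. The three stand volumes are hypothetical, chosen so that they average 24,500 BF per acre, and the stands are assumed to have equal acreage so that a simple mean is appropriate; with unequal acreages you would weight by acres.

```python
target_average = 22_000.0  # BF per acre from the measured stands

# Hypothetical matched estimates (BF per acre) for three unmeasured stands.
matched = {"U1": 30_000.0, "U2": 24_500.0, "U3": 19_000.0}

current_average = sum(matched.values()) / len(matched)
ratio = target_average / current_average

# Scale every stand by the same ratio; relative differences are preserved.
adjusted = {name: volume * ratio for name, volume in matched.items()}
adjusted_average = sum(adjusted.values()) / len(adjusted)

print(f"ratio = {ratio:.3f}")
print(f"adjusted average = {adjusted_average:,.0f} BF per acre")
```

Because every stand is multiplied by the same ratio, a stand that was 20% above average before the adjustment is still 20% above average afterward.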

This adjustment might quickly be done by adjusting the plot size. Volume per acre is the volume on the plot divided by the plot size, so a larger plot size gives a smaller per-acre figure. If you change the plot size from 0.10 acre to (24,500/22,000)*0.10 acre (which gives 0.1114 acres for the plot size) then the computed volumes will each change by just enough to produce the unbiased average volume per acre of 22,000 BF. In essence, you have slightly changed the numbers of trees in each stand in order to produce the right total. All the tree sizes and the diameter distributions will be appropriate.
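A quick numerical check of the plot-size adjustment, assuming that per-acre volume is simply the volume tallied on the plot divided by the plot size in acres. The plot volume of 2,450 BF is a hypothetical figure picked so that the starting value is 24,500 BF per acre.

```python
def volume_per_acre(plot_volume_bf, plot_size_acres):
    """Expand a plot tally to a per-acre value (assumed fixed-area plot)."""
    return plot_volume_bf / plot_size_acres


original_plot_size = 0.10   # acres
current_average = 24_500.0  # BF per acre before adjustment
target_average = 22_000.0   # BF per acre wanted

# A larger plot size lowers the per-acre value, so scale by current/target.
adjusted_plot_size = original_plot_size * current_average / target_average

plot_volume = 2_450.0  # hypothetical BF tallied on one plot
before = volume_per_acre(plot_volume, original_plot_size)
after = volume_per_acre(plot_volume, adjusted_plot_size)

print(f"adjusted plot size = {adjusted_plot_size:.4f} acres")
print(f"before = {before:,.0f}, after = {after:,.0f} BF per acre")
```

The same tree records are used before and after; only the expansion from plot to acre changes, which is why the tree sizes and diameter distribution are untouched.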

There are other ways that you might do this adjustment, but the important principles of the overall process are these:

You can get individual answers for each stand by substituting "most similar" matches to each stand.
You can use several methods to match up measured data with unmeasured stands.
If the overall answer does not add to the appropriate total after this matching, adjust it in some manner.

You end up with a complete description of the land base, with individual values for stands that sum to a "correct" answer. No stratification is involved. This method is a good example of distributing a known total into the individual stands that make up your land base. Now that computers can hold large databases, it no longer makes much sense to use one description for many stands. This is one of the methods for implementing such a procedure.