Background: The IPHC staff had proposed a thorough peer review of its new stock assessment method for the spring of 1998. Because of the dramatic changes implied by the results, the review was moved up to the week of September 29, 1997. The three panel members were all senior assessment scientists: Dr. Joseph Horwood of the fisheries laboratory at Lowestoft in the United Kingdom, Dr. Victor Restrepo of the University of Miami and the National Marine Fisheries Service, and Mr. Stephen Smith of the Department of Fisheries and Oceans in Halifax. Below is the report submitted by the panel.

 

 

SCIENTIFIC PEER REVIEW OF

PACIFIC HALIBUT STOCK ASSESSMENT

 

INTRODUCTION

A scientific peer Review Group met at the International Pacific Halibut Commission (IPHC), Seattle, Washington, over the period 29 September to 2 October 1997. The group comprised Dr J Horwood (Chair), Dr V Restrepo and Mr S J Smith.

The Terms of Reference (ToR) for the Review Group were given by the IPHC as addressing:

quality and adequacy of data collection procedures,

appropriateness and adequacy of the stock assessment model,

adequacy of bycatch accounting procedure,

appropriate use of auxiliary information.

Prior to the meeting, the Review Group was provided with a Report of Assessment and Research Activities, 1996, and summary reports to the 73rd meeting of the IPHC. This material included detailed information on the data, models, assessments and management. The review process included introductory presentations, from IPHC staff, during the first day and first part of the second day. Subsequently detailed discussions were held with IPHC staff concerning data and modeling, and much additional material was provided, along with additional modeling. Feedback sessions were held on 1 and 2 October.

It was evident at an early stage that there was significant complexity in: the biology of the halibut and its environment; the development of the halibut fishery and its regulation; the collection and analysis of data; and modeling and assessments. Consequently the Review Group considered it was not possible, in the time available, to develop an adequately broad and deep understanding to enable a full critique of all aspects of the work undertaken by IPHC, and to fully address all the ToRs.

Nevertheless, the Review Group acknowledged that it had been given a substantial insight into the assessment process, and was able to make positive comments on several key issues. The Review Group paid particular attention to areas relating to data quality, especially as it related to surveys, and to modeling and assessment, bearing in mind the large revisions in the state of the stocks resulting from the recently implemented model incorporating selectivity changes. Additional considerations are summarized in Appendix I.

DATA

The IPHC sampling programs for otoliths, lengths, etc. from the commercial fishery are quite extensive, with what appear to be enviable sampling levels. Sampling is an extremely time-consuming and logistically difficult activity, and the Commission staff should be commended for their conduct of this vital activity. Data on age, length and weight were available from both the commercial fishery and the research surveys. This permitted verification that changes in growth, noted for the commercial samples, were also seen for each sex separately in samples taken from the research surveys.

There has been a great deal of research on determining bycatch levels, estimating survival of released animals and considering the effects of bycatch of sub-legal sized halibut on recruitment to all of the management areas. Estimated levels of bycatch appear to have limited impact on the assessment and management advice for halibut and consequently we turned our attention to other aspects of the assessment.

Line trawl survey data for areas 2B and 3A were made available to the Review Group to evaluate the results of the stock assessment model. Assuming constant selectivity over a range of ages, simple estimates of total mortality can be calculated from survey catch-at-age data as a check on estimates and trends coming out of the stock assessment model. These have been calculated as per the equation below for the blocks of consecutive years available and over a range of age groups. The earlier ages are not fully selected but the annual patterns appear independent of the ages selected. The total mortality can be estimated as:

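$$Z_{a,j} = \ln\left(\frac{\sum_{i \ge a} N_{i,j}}{\sum_{i \ge a+1} N_{i,j+1}}\right),$$

where $N_{i,j}$ are the CPUE in numbers from the survey at age $i$ and year $j$, and $a$ is the youngest age in the group (so the 8+/9+ group compares ages 8 and older in one year with ages 9 and older in the next).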
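As a minimal sketch of this calculation, assuming the plus-group log-ratio form implied by the age-group labels in Tables 1 and 2 (for example, 8+ in one year against 9+ in the next), the estimate could be computed as follows. The function name and the CPUE values are hypothetical and are not IPHC data.

```python
import math

def total_mortality(cpue_year, cpue_next_year, youngest_age):
    """Log ratio of CPUE for ages >= youngest_age in one year
    to CPUE for ages >= youngest_age + 1 in the following year."""
    numer = sum(n for age, n in cpue_year.items() if age >= youngest_age)
    denom = sum(n for age, n in cpue_next_year.items() if age >= youngest_age + 1)
    return math.log(numer / denom)

# Hypothetical survey CPUE in numbers by age (illustrative values only)
cpue_1995 = {8: 40.0, 9: 30.0, 10: 20.0, 11: 10.0}
cpue_1996 = {8: 45.0, 9: 28.0, 10: 21.0, 11: 14.0}

z_8plus = total_mortality(cpue_1995, cpue_1996, 8)

# A cohort group that appears larger in the second year gives a negative estimate
z_negative = total_mortality({8: 10.0}, {9: 20.0}, 8)
```

A negative value cannot be a real mortality rate. It signals that the survey caught more fish from the same cohorts in the second year, for example because catchability differed between the two years.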

The results for the mortality estimates for area 2B for 1977/78 and 1983/84 stand out, as they indicate that all cohorts increased in number in the latter year of each pair (Table 1). Thus the survey was an inconsistent indicator of year-class strength for these two periods. Note that the 1983/84 period was when the J hooks were replaced by circle hooks in the survey. IPHC staff had applied a conversion factor to these data so that the whole series is in terms of circle hook gear. It may be that the negative mortality estimates in 1977/78 and 1983/84 were due to increased catchability of the halibut in 1978 and 1984 because of more favorable weather or some other change in conditions. There was no information available to judge such causes. However, the change in 1983/84 may also be due to a greater catchability of the circle hook relative to the J hook than was captured in the conversion factor.

Table 1. Total mortality estimates from survey numbers at age. Area 2B survey.

Age groups   77/78  80/81  81/82  82/83  83/84  84/85  85/86  95/96
8+/9+        -0.08   0.66  -0.06   0.06  -0.16   0.33   0.66   0.17
9+/10+       -0.16   0.73   0.00   0.22  -0.11   0.43   0.63   0.30
10+/11+      -0.22   0.72   0.11   0.40  -0.07   0.46   0.53   0.33
11+/12+      -0.44   0.72   0.08   0.43  -0.14   0.43   0.54   0.33
12+/13+      -0.32   0.66   0.12   0.69  -0.09   0.48   0.62   0.41
13+/14+      -0.48   0.58   0.23   0.90  -0.12   0.40   0.52   0.43
14+/15+      -0.57   0.65   0.33   0.77  -0.20   0.18   0.67   0.46

Assuming a natural mortality of 0.2, the implied fishing mortalities in the more recent period (84 to 86 and 95/96) appear to be higher than those estimated by the new assessment model.
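The implied fishing mortalities follow from subtracting natural mortality from total mortality (F = Z - M). A minimal sketch, using the area 2B values from Table 1 and the natural mortality of 0.2 stated above; the function and variable names are illustrative only.

```python
NATURAL_MORTALITY = 0.2  # value assumed in the text

# Total mortality estimates for area 2B, read from Table 1
# (age groups 8+/9+ through 14+/15+, in that order)
z_2b = {
    "84/85": [0.33, 0.43, 0.46, 0.43, 0.48, 0.40, 0.18],
    "85/86": [0.66, 0.63, 0.53, 0.54, 0.62, 0.52, 0.67],
    "95/96": [0.17, 0.30, 0.33, 0.33, 0.41, 0.43, 0.46],
}

def implied_fishing_mortality(z_values, m=NATURAL_MORTALITY):
    """Fishing mortality implied by total mortality Z and natural mortality m."""
    return [round(z - m, 2) for z in z_values]

implied_f = {period: implied_fishing_mortality(zs) for period, zs in z_2b.items()}
```

These are rough survey-based figures and inherit any year-to-year catchability changes in the survey, so they serve only as a check against the model estimates.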

In the case of the area 3A survey there were four years where the total mortality estimates for all ages were negative (Table 2). Again these results indicate that the survey had problems tracking cohort strength over these years. Similar to the 2B survey, the estimates were negative for all ages in 1983/84. In addition, all of the estimates for 1978/79, 1979/80 and 1980/81 were also negative. Implied estimates for fishing mortality in the most recent years suggest that, except for 1995/96, exploitation may have been at a higher level than estimated by the stock assessment.

Table 2. Total mortality estimates from survey numbers at age. Area 3A survey.

Age groups   77/78  78/79  79/80  80/81  81/82  82/83  83/84  84/85  85/86  93/94  94/95  95/96
8+/9+         0.70  -0.16  -0.24  -0.28   0.02   0.38  -0.26   0.00   0.57   0.02  -0.05   0.38
9+/10+        0.71  -0.16  -0.26  -0.24   0.18   0.42  -0.24   0.07   0.58   0.12   0.07   0.39
10+/11+       0.66  -0.13  -0.34  -0.20   0.29   0.38  -0.20   0.14   0.58   0.28   0.15   0.33
11+/12+       0.64  -0.06  -0.36  -0.19   0.41   0.38  -0.08   0.20   0.54   0.36   0.19   0.27
12+/13+       0.76   0.14  -0.38  -0.18   0.58   0.32  -0.05   0.27   0.55   0.49   0.32   0.22
13+/14+       0.84  -0.02  -0.53  -0.14   0.75   0.37  -0.05   0.44   0.47   0.56   0.42   0.06
14+/15+       0.88   0.08  -0.35  -0.13   0.86   0.36  -0.13   0.60   0.34   0.63   0.56   0.01

With the introduction of individual vessel quotas in area 2B (British Columbia) in 1992 and individual fishing quotas in area 3A (Gulf of Alaska) in 1995, commercial catch-rate information has become less useful as an indicator of stock abundance. Survey coverage of the 2B and 3A areas has been intermittent, with gaps of several years in the series. Currently, there is a five-year program of extending survey coverage from the northern California coast to the Bering Sea. Recent surveys in 2B and 3A have also extended the survey area within their respective management areas. The intermittent nature of these surveys in the past and the expanding area of the recent surveys make it difficult to construct a consistent series with respect to area and year for this stock area. A concerted research effort will be needed to investigate ways of incorporating the additional areas into an annual survey CPUE which will be comparable over the whole time series. Incorporation of the additional survey data is preferable to continuing to use only a subset of the survey sets that cover a common area over the series.

Given the increasing importance of survey CPUE for the stock assessment process we recommend that a consistent research survey be maintained as a routine part of the annual monitoring of the halibut stock. We also encourage work on relating the NMFS trawl surveys with the IPHC set line surveys. Given the close correspondence noted between the two series on page 311 of the 1996 IPHC Report of Assessment and Research Activities, the long time series for the NMFS survey has proven to be of strategic importance.

We had concerns that the large gap in the survey between 1986 and 1993, and the possibility that the change in hook type may not have been fully accounted for, may have implications for the sensitivity of the assessment model to assumptions about these data. A number of alternative runs of the current model were made for us by IPHC staff to investigate the effect of assuming different catchabilities before and after implementation of the circle hook, as well as the effect of removing the survey CPUE series from the model. Allowing for two different periods in catchability resulted in lower estimates for exploited biomass than the current model with and without the survey. Removal of the survey alone resulted in similarly lower estimates of exploited biomass. As a result we suggest that questions regarding the hook changes raised from the results in Tables 1 and 2 be investigated further.

 

MODELING AND ASSESSMENT

Model population structure

The annual and developmental movements of the halibut are relatively complex. Adults from all management and assessment areas congregate to spawn on winter spawning grounds, and juvenile recruitment is dependent upon ocean transport systems. The Review Group was told that once fish had recruited to a management area then they were relatively faithful to that area.

The protection of fish on the winter spawning grounds removes one element of complexity (that of apportioning winter catch back to the management areas), but bycatch of juveniles in northern groundfish trawl fisheries may have had an impact upon recruitment to other areas. IPHC staff have modeled the impact of this bycatch on recruitment into different management areas. There is also some indication that fishing can be concentrated on the boundary of management areas, and this may pose a problem for assessment. The Review Group did not have time to address the appropriateness of the assessment areas, but its judgment was that this was of lesser priority than other issues discussed below.

Model fit diagnostics

The Review Group examined the statistical model outputs for the various areas with the objective of evaluating the consistency of the analyses on a regional scale.

With respect to within-data-set results, it is apparent that there exist data-model inconsistencies. These are evidenced, for example, by patterns in the residuals in the fits to length-at-age data contrasted to patterns in the fits to the survey and commercial catch-at-age proportions. Currently, IPHC staff weight the various pieces of data going into the model based partly on externally computed sample variances, and partly on other ("information content") considerations. However, how much influence a particular data input has on the final results does not depend solely on these weights, but also on the model formulation which determines how the various pieces interact with each other. The Review Group felt that the approach used for weighting is sensible but, given that some choices are ultimately subjective, the Review Group also felt that more explicit sensitivity analyses would be desirable. At the request of the Review Group, IPHC staff conducted an additional run for area 3A in which the weights given to the length-at-age data were increased for both the survey and commercial data. Results of this sensitivity trial gave a lower commercial catchability and correspondingly higher estimates of exploitable biomass and recent recruitment. It also showed that the selectivity curve changed during the last decade in a manner that was different from that in the base case assessment. Thus, while limited in scope and inappropriate for suggesting a superior model formulation, the analysis demonstrated that the choices of information weighting can be important.
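To illustrate why the choice of weights matters, here is a toy sketch. It is not the IPHC model: it assumes two hypothetical data sources that each give an estimate of the same quantity, combined by minimizing a weighted sum of squared deviations. All names and numbers are invented for illustration.

```python
def weighted_estimate(estimates, variances, weights):
    """Value of theta minimizing sum_k w_k * (theta - est_k)**2 / (2 * var_k)."""
    precisions = [w / v for w, v in zip(weights, variances)]
    return sum(p * e for p, e in zip(precisions, estimates)) / sum(precisions)

# Two hypothetical data sources that disagree about the same quantity
estimates = [100.0, 160.0]
variances = [1.0, 1.0]

base_case = weighted_estimate(estimates, variances, [1.0, 1.0])
upweighted = weighted_estimate(estimates, variances, [1.0, 3.0])  # second source weighted 3x
```

When the data sources conflict, the combined estimate moves toward whichever source is given more weight, which is why explicit sensitivity runs on the weights are informative.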

Another important within-data-set consideration was highlighted by IPHC staff for the analyses in 3A. This consisted of alternative model runs assuming that the survey selectivity curve was either a constant function of age or a constant function of size. The results of these two runs differed considerably, especially in terms of recent recruitment and biomass levels. Differences in results could also be expected if survey selectivity were modeled in other ways. This again highlights the interplay that exists between data inputs and model formulation. Because of the complexity and high parameterization of the model, it is unreasonable to expect that fit diagnostics alone would aid in determining whether one assumption about survey selectivity is superior to the other, or superior to competing assumptions that were not examined. Thus, as in the case of weighting, choices are made external to the estimation process and it is important to examine the sensitivity of the results to these choices.

Comparison of various base case model outputs for all areas is presented in Figure 1. The size at full selection for the commercial gears points to a possible inconsistency between how selectivity and size are perceived (and modeled) to interact, and the information contained in the data sets: For area 2AB, this size at full selection is substantially lower than it is for the other areas, and is close to the minimum size of 81 cm. Thus the argument that selectivity has changed due to changes in growth and that this has made young fish less available does not seem to hold equally for all areas. There seems to be an inconsistency between data sets.

Another peculiarity highlighted in the comparisons between data sets (Figure 1) is that of the estimated catchabilities for the commercial gear: The catchability for area 3B is about four times larger than it is for other areas. Since the CPUE data for areas 3A and 3B are roughly similar, the consequence of the difference in catchability estimates is that the biomass in area 3B is perceived to be much lower than that of area 3A. The Review Group was unable to reconcile these differences in light of all the information presented. Several aspects could play a role, notably migration between areas 3A and 3B and the lack of survey CPUE data for area 3B.
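The link between catchability and perceived biomass can be written out under the common assumption that CPUE is proportional to biomass, CPUE = qB. This proportional form is an assumption made here for illustration; the model's exact observation equation is not given in this report. With similar CPUE in the two areas and a catchability in 3B about four times that in 3A:

```latex
\mathrm{CPUE}_A = q_A B_A
\quad\Longrightarrow\quad
\frac{B_{3B}}{B_{3A}}
= \frac{\mathrm{CPUE}_{3B} / q_{3B}}{\mathrm{CPUE}_{3A} / q_{3A}}
\approx \frac{q_{3A}}{q_{3B}}
\approx \frac{1}{4}
```

Under that assumption, the estimated biomass in area 3B would be roughly a quarter of that in area 3A for the same catch rate.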

 

SYNTHESIS

The Review Group presents, for managers, this synthesis as its appreciation of the current evolution of assessments and their significance for management. Much of this is reflected in documents to the Commission from the IPHC.

The previous assessment model (CAGEAN) fitted a standard fisheries population model to data on catch-at-age and commercial catch-rates (CPUE). This model gave results which, it was argued (via an inspection of "retrospective" analyses), tended to underestimate stock biomass.

Further, a large decrease in length and weight at age is observed in recent times. It was postulated by the IPHC that this would have an effect on the selectivity of fish at age. The CAGEAN model assumed that selectivity at age remained constant over the time period of the analyses.

Driven by these two arguments the IPHC developed a new model which attempted to describe the change in selectivity-at-age over time. It was assumed that smaller fish-at-age were less susceptible to capture. Consequently a given catch-at-age indicated more fish in the sea as fish became smaller over time. It is this key feature that gives the main difference in the estimated magnitude of the population between the two models.
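The mechanism can be sketched with a deliberately simplified catch relation, catch = selectivity x exploitation rate x numbers. This is an illustrative assumption and not the equation used in either model; the numbers are hypothetical.

```python
def implied_numbers(catch, selectivity, exploitation_rate):
    """Numbers at age implied by a catch, assuming
    catch = selectivity * exploitation_rate * numbers."""
    return catch / (selectivity * exploitation_rate)

catch_at_age = 1000.0     # hypothetical catch in numbers at one age
exploitation_rate = 0.2   # hypothetical rate on fully selected fish

# Constant-selectivity view: fish at this age are fully selected
n_constant = implied_numbers(catch_at_age, 1.0, exploitation_rate)

# Selectivity at this age assumed to have halved as fish became smaller
n_reduced = implied_numbers(catch_at_age, 0.5, exploitation_rate)
```

In this toy case halving the assumed selectivity doubles the number of fish implied by the same catch, which is the direction of the difference between the two models.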

The Review Group identified some reasons why the new model should be used with caution.

The argument for a change in selectivity is based on a presumption, and there is limited empirical evidence for such change. Although the "retrospective" patterns from CAGEAN are far from ideal, a unique source of the patterns has not been established.

The model incorporating growth changes and changes in selectivity is highly complex and the estimation procedure has to fit a large number of parameters (about 250). This complexity makes it difficult to understand the basic behaviour of the model, and risks the model being driven by noise in the data.

Inspection of the results from the new model has revealed some peculiarities. Catchabilities (the likelihood of capture of the older ages) and selectivities vary significantly between areas. There are significant trends in other parameters described above. The result that young fish in area 2A/B are fully selected at a young age goes counter to the argument that selectivity has changed and young fish are less available.

The results from the new model thus pose taxing questions, which are elaborated further in the Advice sub-section.

For management purposes, the scientific advisers should clarify the strengths and weaknesses of alternative model approaches, including CAGEAN, data and estimation, and seek to develop advice which is robust to weaknesses in model structure and estimation.

COMMUNICATION

The Review Group was asked to comment, where possible, on communication of IPHC results.

The Review Group considered that communication to the Commission and user groups was an important consideration, especially at a time when assessment methods and results had been significantly changed.

The Review Group had received papers produced for the Commission. Many of these were very technical and the key issues can easily be buried in the complexity. At the same time, the detail produced was insufficient for a highly technical evaluation of the models and estimates.

The Review Group considers that it is likely that the Commission will have to be presented with a simpler critical presentation of the biological and fishery basis of the different models. If the IPHC staff do not consider that one model is fully appropriate, as a basis for advice, then the Commission will have to be presented with the uncertainties and the arguments that will form the basis for IPHC staff advice. Presenting the results from a number of plausible different models may help to explain the implications of the identified uncertainties.

Although the technical issues are considerable, the underlying elements driving the results and possibilities for action are relatively straightforward, and are amenable to a simple and clear presentation. This is not to say that the eventual decisions made by managers will be straightforward.

The complexity of the current model, and the general evolution of more complex models, presents a problem of communication amongst experts. A key issue is the sensitivity of the models to a large array of "what if" assumptions. Any reviewer would wish to have available such examinations which are conducted as a matter of course by IPHC staff. The orderly documentation of these examinations, as they are conducted, would aid future examinations.

THE PRECAUTIONARY APPROACH

The Precautionary Approach for fishery management (PA) is now explicitly recognized in international agreements, such as the Highly Migratory and Straddling Stocks Agreement and the FAO Code of Conduct for fishing. These agreements may or may not be binding on the IPHC, but in any case the IPHC may well wish to manage in a manner it considers consistent with the PA.

The different agreements have many elements in common, and there is a degree of formality which will make it possible for third parties to argue that a fisheries commission is or is not behaving in a manner consistent with a PA.

The IPHC staff may wish to provide advice to the IPHC which they consider is consistent with a PA.

Providing advice which is "conservative" may be precautionary, but is inadequate in itself to be considered as consistent with a PA.

It is likely that the IPHC will have to develop various fishery reference points consistent with a PA for fishery management. This may include a re-examination of appropriate measures of reproductive output including the effects of changes in sex ratios.

ADVICE TO IPHC

Based upon the above approach, and cognizant of the limited nature of the review, the Review Group offers the following opinions:

The model development by IPHC staff has been innovative and fully appropriate in responding to concerns about their original model. However, the new model is extremely complex in both its structure and estimation procedures. There are evident concerns about the model's results, as they indicate between-area and within-area differences that are difficult to reconcile. Consequently the new model should be used with caution.

As a matter of routine, simple parameter-parsimonious models should be used alongside more sophisticated models.

It follows that the IPHC staff need, as a matter of urgency, to develop a form of advice that adequately reflects, and is robust to, the current weaknesses in the alternative model approaches.

A key issue is the assumption that selectivity has changed with size or age. It is difficult to envisage that the new model can be fully accepted without external validation, by experiment or otherwise, of this key assumption.

The survey data have been of great importance in assisting to validate the models and in increasing confidence in the unbiased nature of estimates. Commercial CPUE data suffer from the problem that fishers continually change their mode of operation in response to management and other drivers, which distorts the interpretation of CPUE; IVQs are one such example. The development and consistent continuation of designed surveys will be of great importance in the provision of advice.

The model behaviour appears particularly poor in Area 4 and alternative assessment methods should be used in at least that area.

Comments on the development of a more formal Precautionary Approach for fishery management and communication of results are given above.

ACKNOWLEDGMENTS

The Review Group acknowledges the ready assistance provided by the IPHC staff, without which it would not have been possible to conduct this review. We particularly thank Dr. W Clark, Dr. A Parma and Dr. P Sullivan.

J Horwood

V Restrepo

S Smith

23 October 1997

 

APPENDIX I

Additional Information Considered by the Review Group (RG)

The IPHC staff provided additional information and analyses to the RG, including past assessment results, new sensitivity analyses made at the request of the RG, and detailed verbal discussion of many issues that were of interest to the RG. A description of these is not provided in the main body of the review report because the group felt this could detract from the main focal points of interest. This Appendix presents an annotated summary of some of the issues and analyses considered by the RG.

I. 1 Selectivity.

A VPA-type model run two years ago (area 3A) and tuned to survey data provided estimates of selectivity-at-age for the commercial gear, assuming that the commercial catches were exact. While the results are conditional to an unknown degree on the manner in which survey selectivity was modeled, they do suggest a decline in selectivity for younger halibut (ages 8-9) after 1985. This would lend some (model-based) support for the modeling of growth-selectivity interactions as was done in the 1996 assessments. However, the same VPA results also suggest a dome-shaped selectivity for the older (16+) ages in the commercial catches, which is counter to the asymptotic assumption made in the parameterization used for the 1996 assessment.

At the request of the RG, IPHC staff ran a modified version of the base case model (area 3A) in which the growth aspects were removed and the commercial selectivities were estimated as constrained random walks for each age. Results of this run also suggested a decrease in selectivity for ages 8-13 during the last decade. For lack of time, no further runs were made to allow for such possibilities as time trends in survey selectivity or dome-shaped commercial selectivity. It is unknown to what degree such changes may affect one's perception of trends in the vulnerability of small halibut to the commercial fishery.

The RG reiterated its belief that efforts should be made to learn about selectivity external to the assessment model.

I.2 Sexual dimorphism.

Halibut exhibit sexual dimorphism in growth, an aspect which is not explicitly incorporated in the assessments. This should not normally bias an age-based assessment like that of Pacific halibut in which ages are adequately sampled (unless there is also sexual dimorphism in availability or mortality). The 1996 assessment, however, explicitly models variability in size-at-age for the population. The RG did not know whether basic results would differ substantially if the assessments had attempted to incorporate sex differences; some sensitivity analyses might be profitable. A related consideration is that of disaggregating estimated numbers at age by sex in order to obtain more accurate estimates of trends in female spawning biomass.

I.3 Mid-1980s.

From the available documents and presentations made, it became evident that important changes took place in the mid-1980s, including: a switch from "J" hooks to "Circle" hooks in both fishery and surveys; the onset of the latest cycle in growth changes; and the subsequent interruption of the surveys. The IPHC conducted appropriate experiments to estimate the fishing power of C hooks relative to J hooks, obtaining a factor of 2.2:1 which was applied to adjust the entire time series of commercial and survey CPUE before assessment. At the request of the RG, IPHC staff conducted a sensitivity run (area 3A) in which the two series were de-coupled in 1984 so that there would be four different series (with the two survey series overlapping in 1984). The differences in catchabilities estimated by this run were used to obtain a model-based adjustment factor, i.e., one that was independent of the at-sea experiments. The resulting adjustment was 20% to 50% greater. While this exercise provides no proof that the "true" adjustment should be greater than 2.2, it does suggest that certain patterns remain in the data around the mid-1980s. The results may also be interpreted as an indication that model-based estimates are not necessarily consistent with hard experimental evidence. The negative estimates of total mortality, from the survey data for this period, are also suggestive that the correction factor did not capture all of the differences between the two types of hooks.
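The arithmetic behind this comparison can be sketched as follows, reading "20% to 50% greater" as a multiplicative increase on the experimental factor of 2.2 (an interpretation made here, not stated in the report). The last step shows how an under-sized conversion factor would bias a log-ratio mortality estimate spanning the hook change; variable names are illustrative.

```python
import math

EXPERIMENTAL_FACTOR = 2.2  # circle:J fishing power from the at-sea experiments

# Model-based factor, read as 20% to 50% greater than the experimental one
model_based_low = EXPERIMENTAL_FACTOR * 1.20
model_based_high = EXPERIMENTAL_FACTOR * 1.50

def mortality_bias(true_factor, used_factor=EXPERIMENTAL_FACTOR):
    """Amount by which Z = ln(N_j / N_{j+1}) is understated when J-hook CPUE
    in year j was scaled by used_factor instead of true_factor."""
    return math.log(true_factor / used_factor)

bias_low = mortality_bias(model_based_low)    # ln(1.2)
bias_high = mortality_bias(model_based_high)  # ln(1.5)
```

If the larger factors were correct, the 1983/84 estimates would be understated by roughly 0.18 to 0.41, enough to turn most of the negative 1983/84 values in Tables 1 and 2 positive. This is consistent with, but does not prove, an incomplete hook correction.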

I.4 Growth and the Environment

Several of the available documents and presentations implied that the recent changes in growth observed in some areas are environmentally driven, as they may have also been during the early history of the fishery. While these environmental links seem plausible and even logical, the RG felt that they had not been firmly established and that the IPHC staff should not dismiss the possibility of density-dependent effects. The RG supports the IPHC initiative to conduct more research on the environment targeted specifically to growth, reproduction and recruitment.

I.5 Alternative Models

The RG did not have the opportunity to explore runs made with different models. However, the group felt that IPHC staff would benefit from further analyses using more parsimonious model formulations such as modified versions of CAGEAN. Specifically, it would be interesting to conduct runs including the survey data and splitting the blocks of years for which constant selectivity is assumed (e.g. 1974-1990, 1991-1995).
