Saturday, September 24, 2016

The first drug approval by FDA for treating Duchenne Muscular Dystrophy (DMD) - who wins, who loses?

Earlier this year, I wrote an article, “Race for the first drug approval by FDA for treating Duchenne Muscular Dystrophy (DMD)”; now comes great news for DMD patients and for the biotechnology company Sarepta Therapeutics. This week, FDA approved Sarepta’s Eteplirsen for treating DMD patients – the first drug FDA has approved for treating DMD. The approval of Eteplirsen is held up as a great example of a rare-disease drug approval in which FDA’s flexibility and sensitivity to the obstacles of drug development for rare diseases brought forth a treatment.

However, this approval is surrounded by controversy. In April 2016, FDA’s advisory committee voted against the approval, citing insufficient evidence of efficacy. It is rare for FDA not to follow an advisory committee’s recommendation. On the other hand, we can clearly see that there was a strong push from patient advocacy groups, most likely orchestrated by the company. FDA tends to bow to public pressure in approving drugs for which there may not be substantial evidence. This was shown in the approval of Eteplirsen for DMD in this case, and in last year’s approval of Flibanserin (Addyi, the so-called female Viagra) for the treatment of pre-menopausal women with hypoactive sexual desire disorder (HSDD).

The Federal Food, Drug, and Cosmetic Act says:
“…the term ‘‘substantial evidence’’ means evidence consisting of adequate and well-controlled investigations, including clinical investigations, by experts qualified by scientific training and experience to evaluate the effectiveness of the drug involved, on the basis of which it could fairly and responsibly be concluded by such experts that the drug will have the effect it purports or is represented to have under the conditions of use prescribed, recommended, or suggested in the labeling or proposed labeling thereof.”

The adequate and well-controlled investigations should have the following features:
1.      A clear statement of the objectives of the investigation and a summary of the …methods of analysis
2.      The study uses a design that permits a valid comparison with a control to provide a quantitative assessment of drug effect. The protocol…should describe the study design precisely.
3.      The method of selection of subjects provides adequate assurance they have the disease or condition being studied.
4.      The method of assigning patients to treatment and control groups minimizes bias and …assure(s) comparability of the groups.
5.      Adequate measures are taken to minimize bias on the part of the subjects, observers, and analysts of the data.
6.      The methods of assessment of subjects’ response are well-defined and reliable.
7.      There is an analysis of the results of the study adequate to assess the effects of the drug.

Based on FDA’s briefing document for the advisory committee meeting, there were many issues that ran against these principles for an adequate and well-controlled investigation. The primary efficacy endpoint was not statistically significant. The assay and assessments were not reliable. Missing data were handled inadequately in the analysis. The historical control was used inadequately. And there were a lot of fishing expeditions – searching for positive signals in largely negative studies.

As mentioned in an article in Boston Globe "For a mother, a bittersweet victory as a long-sought drug is finally approved":
In approving the drug — called eteplirsen, it’s made by Cambridge-based Sarepta Therapeutics — the FDA overruled its own staff and advisers, who concluded there was not enough evidence it worked. Even if it does, it’s expected to help only 13 percent of the estimated 20,000 people in the United States with Duchenne.
But advocates like Christine McSherry begged the FDA to approve it, arguing it was their only hope.

Studies to Support the Approval of Eteplirsen

Study      Phase       Sample Size              Study Design
Study 28   Phase 1     –                        Single arm, no concurrent control
Study 33   Phase 1/2   –                        Open label, dose-ranging study
Study 201  Phase 2     12 (4 patients per arm)  Randomized, controlled study with two active dose groups versus placebo
Study 202  Phase 2     –                        Open label extension

Right after FDA’s advisory committee voted against the approval of Eteplirsen, the Catch-22 for FDA was clear:
Does the agency approve a generally safe but possibly ineffective DMD treatment based on limited data, and then rely on post-marketing data to see whether the treatment is really effective, possibly raising false hopes in the families and young boys who believe the drug is working?
Does FDA reject eteplirsen, wait for more data to prove its effectiveness, and possibly deny access to the treatment (for up to three years) until there is more concrete evidence of the treatment’s effectiveness?

FDA decided to go with the first option and approve eteplirsen. Eteplirsen was approved under the accelerated approval regulatory pathway, and Sarepta is required to conduct a confirmatory study after the approval. Theoretically, if the confirmatory study cannot show efficacy, the drug should be withdrawn from the market. However, it will be extremely difficult, and unlikely, for the drug to be withdrawn from the market even if the evidence of efficacy from the post-marketing study is not substantial. FDA bows to the public pressure now and will face even more daunting pressure then.

The approval of eteplirsen is a win for the drug developer, for the DMD patient advocacy groups, and for orphan drug development. It is also a loss for FDA, which approved a drug largely due to public pressure.

As for DMD patients, on the surface it seems a victory to have a drug available; however, it is a drug with unproven benefit that may give patients false hope. Hopefully, the fight for an efficacious treatment for DMD will continue, and the approval of eteplirsen will not impede the further pursuit of DMD treatments by other companies.

Tuesday, September 13, 2016

Free FDA Information Repository

To learn and understand the regulations in drug development and clinical trials, the documents on FDA’s website are extremely helpful. Luckily, there is now another website available as an FDA information repository.

Today the DIA regulatory affairs community organized a WebEx. Stephen Weitzman, Executive Director of the MedDATA Foundation, gave an overview of the IRAI free FDA Information Repository. The IRAI repository was initiated under a contract with FDA, with the objective of implementing a new system for cataloging FDA materials in an easily accessible and searchable format for research and training at academic universities, and for all other stakeholders interested in FDA’s laws, regulations, policies, and procedures.

IRAI-Online.Org is open access to all and contains nearly all the information from FDA’s website, plus additional information that may not be available from FDA’s website or is difficult to search for there. Some clinical trial-related information from the NIH website is also included in this new FDA Information Repository website. To request access, go to IRAI-ONLINE.ORG.

According to the presenter, Stephen Weitzman, the new FDA information repository is better cataloged and more searchable than FDA's website. It is organized into volumes I to XXII (volumes 1 to 22). The list of volumes is below.

Friday, September 02, 2016

Dealing with encoding issue in clinical trial data: WLATIN1 and UTF-8

Nowadays, clinical trials have gone global and are usually multinational. Data collection has also moved to electronic data capture (EDC), and clinical trial data are entered directly by the investigational sites, whether those sites are in English-speaking or non-English-speaking countries. One issue we often run into is data encoding.

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.
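The two directions can be illustrated with a short sketch outside SAS; here is a minimal Python example (the sample string is hypothetical, and Python is used here only to illustrate the concepts):

```python
# A minimal illustration of encoding vs. decoding.
text = "Müller"                     # a Unicode string, e.g. entered at a German site
encoded = text.encode("utf-8")      # encoding: characters -> bytes
print(encoded)                      # b'M\xc3\xbcller' (the ü occupies two bytes)
decoded = encoded.decode("utf-8")   # decoding: bytes -> characters
print(decoded == text)              # True: a matched encode/decode round-trips cleanly
```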

To accommodate multinational trials and the need to handle non-English characters, EDC vendors may choose encoding = UTF-8 for their data sets. However, when we use SAS on a Windows system, the session encoding is usually WLATIN1.

In the Windows environment, if we try to read a data set encoded in UTF-8 format, we will get messages such as the ones below:

NOTE: Data file xxxxx is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance.
ERROR: Some character data was lost during transcoding in the dataset xxxxx. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.
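The mismatch behind these messages can be reproduced in miniature outside SAS. In the Python sketch below (the name is hypothetical), UTF-8 bytes read under a Latin-1-style encoding survive but come out garbled, while a character with no representation in the target encoding is what produces outright transcoding loss:

```python
name = "Müller"                           # hypothetical site entry
utf8_bytes = name.encode("utf-8")

# Reading UTF-8 bytes under a Latin-1 session garbles multi-byte characters:
print(utf8_bytes.decode("latin-1"))       # MÃ¼ller

# Transcoding is lossy when a character has no Latin-1 representation:
try:
    "βeta".encode("latin-1")
except UnicodeEncodeError as err:
    print("lost during transcoding:", err.reason)
```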

Here is also a discussion about this issue on SAS website.

There are several ways to ensure that the data are transcoded correctly from one encoding to another. The following three papers provide very good explanations:

According to the paper by Song, there are three ways to change the encoding:
1. Force the transcoding by specifying that the data set needs to become WLATIN1, using the data set option ENCODING=:
data x(encoding='WLATIN1');
   set x;
run;
2. Use PROC DATASETS as below:
proc datasets lib=libname;
   modify x / correctencoding='WLATIN1';
quit;
However, this way is NOT recommended: it only changes the encoding indicator but does not actually translate the data itself!
3. When you would like to convert multiple SAS data sets from WLATIN1 into UTF-8, you can use PROC MIGRATE:
proc migrate in=inlib out=outlib;
run;
This migrates all SAS data sets in libref inlib to libref outlib. It retains SAS data set labels as well. Note that inlib and outlib should be two different locations.

Also, we can use the following approaches:
1. The INENCODING= option on the LIBNAME statement:
libname in 'directory\' inencoding=asciiany;
data x;
   set in.x;
run;
2. The ENCODING= data set option directly on the data set:
proc sort data=RAWDM.AE(encoding='wlatin1') out=OUTSTATS.AE;
   by subject;
run;
Here are some approaches and examples for resolving data encoding issues from the SAS NLS reference guide:

Example 1: Creating a SAS Data Set with Mixed Encodings and with Transcoding Suppressed
By specifying the data set option ENCODING=ANY, you can create a SAS data set that contains mixed encodings, and suppress transcoding for either input or output processing.
In this example, the new data set MYFILES.MIXED contains some data that uses the Latin1 encoding, and some data that uses the Latin2 encoding. When the data set is processed, no transcoding occurs. For example, the correct Latin1 characters in a Latin1 session encoding and correct Latin2 characters in a Latin2 session encoding are displayed.
libname myfiles 'SAS data-library';
data myfiles.mixed (encoding=any);
   set work.latin1;
   set work.latin2;
run;

Example 2: Creating a SAS Data Set with a Particular Encoding
For output processing, you can override the current session encoding. This action might be necessary, for example, if the normal access to the file uses a different session encoding.
For example, if the current session encoding is Wlatin1, you can specify ENCODING=WLATIN2 in order to create the data set that uses the encoding Wlatin2. The following statements tell SAS to write the data to the new data set using the Wlatin2 encoding instead of the session encoding. The encoding is also specified in the descriptor portion of the file.
libname myfiles 'SAS data-library';
data myfiles.difencoding (encoding=wlatin2);

Example 3: Using the FILE Statement to Specify an Encoding for Writing to an External File
This example creates an external file from a SAS data set. The current session encoding is Wlatin1, but the external file's encoding needs to be UTF-8. By default, SAS writes the external file using the current session encoding.
To specify what encoding to use for writing data to the external file, specify the ENCODING= option:
libname myfiles 'SAS data-library';
filename outfile 'external-file';
data _null_;
   file outfile encoding="utf-8";
   put Make Model Year;
run;
When you tell SAS that the external file is to be in UTF-8 encoding, SAS then transcodes the data from Wlatin1 to the specified UTF-8 encoding.

Example 4: Using the FILENAME Statement to Specify an Encoding for Reading an External File
This example creates a SAS data set from an external file. The external file is in UTF-8 character-set encoding, and the current SAS session is in the Wlatin1 encoding. By default, SAS assumes that an external file is in the same encoding as the session encoding, which causes the character data to be written to the new SAS data set incorrectly.
To specify which encoding to use when reading the external file, specify the ENCODING= option: 
libname myfiles 'SAS data-library';
filename extfile 'external-file' encoding="utf-8";
data myfiles.unicode;
   infile extfile;
   input Make $ Model $ Year;
run;
When you specify that the external file is in UTF-8, SAS then transcodes the external file from UTF-8 to the current session encoding when writing to the new SAS data set. Therefore, the data is written to the new data set correctly in Wlatin1.

Example 5: Using the FILENAME Statement to Specify an Encoding for Writing to an External File
This example creates an external file from a SAS data set. By default, SAS writes the external file using the current session encoding. The current session encoding is Wlatin1, but the external file's encoding needs to be UTF-8.
To specify which encoding to use when writing data to the external file, specify the ENCODING= option:
libname myfiles 'SAS data-library';
filename outfile 'external-file' encoding="utf-8";
data _null_;
   file outfile;
   put Make Model Year;
run;
When you specify that the external file is to be in UTF-8 encoding, SAS then transcodes the data from Wlatin1 to the specified UTF-8 encoding when writing to the external file.

Example 6: Using the INFILE Statement to Specify an Encoding for Reading from an External File
This example creates a SAS data set from an external file. The external file's encoding is in UTF-8, and the current SAS session encoding is Wlatin1. By default, SAS assumes that the external file is in the same encoding as the session encoding, which causes the character data to be written to the new SAS data set incorrectly.
To specify which encoding to use when reading the external file, specify the ENCODING= option: 
libname myfiles 'SAS data-library';
filename extfile 'external-file';
data myfiles.unicode;
   infile extfile encoding="utf-8";
   input Make $ Model $ Year;
run;
When you specify that the external file is in UTF-8, SAS then transcodes the external file from UTF-8 to the current session encoding when writing to the new SAS data set. Therefore, the data is written to the new data set correctly in Wlatin1.

Incorrect encoding can be stamped on a SAS 7 or SAS 8 data set if it is copied or replaced in a SAS 9 session with a different session encoding from the data. The incorrect encoding stamp can be corrected with the CORRECTENCODING= option in the MODIFY statement in PROC DATASETS. If a character variable contains binary data, transcoding might corrupt the data.

Monday, August 01, 2016

Should hypothesis tests be performed and p-values be provided for safety variables in clinical trials designed for efficacy evaluation?

The p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true. In many situations, p-values from hypothesis testing have been over-used, mis-used, or mis-interpreted. The American Statistical Association seems to be fed up with the mis-use of p-values and has formally issued a statement about the p-value (see AMERICAN STATISTICAL ASSOCIATION RELEASES STATEMENT ON STATISTICAL SIGNIFICANCE AND P-VALUES). It also provides the following six principles to improve the conduct and interpretation of quantitative science.
  •  P-values can indicate how incompatible the data are with a specified statistical model.
  •  P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  • Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  • Proper inference requires full reporting and transparency. 
  • A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  • By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

However, we continue to see cases where p-values are over-used, mis-used, mis-interpreted, or used for the wrong purpose. One area of p-value misuse is the analysis of safety endpoints such as adverse events and laboratory parameters in clinical trials. One of the ASA's principles is that “a p-value, or statistical significance, does not measure the size of an effect or the importance of a result” – ironically, this is also why people present tens or hundreds of p-values, hoping that a p-value will measure the size of an effect or the importance of a result.

In a recent article in the New England Journal of Medicine, p-values were provided for each adverse event even though hypothesis testing to compare the incidence of each adverse event was not the intention of the study. In Marso et al (2016), Liraglutide and Cardiovascular Outcomes in Type 2 Diabetes, the following summary table was presented with p-values for individual adverse events.

The study protocol did not mention any inferential analysis for adverse events. It is clear that these p-values presented in the article are post-hoc and unplanned. Here is the analysis plan for AE in the protocol.

“AEs are summarised descriptively. The summaries of AEs are made displaying the number of subjects with at least one event, the percentage of subjects with at least one event, the number of events and the event rate per 100 years. These summaries are done by seriousness, severity, relation to treatment, MESI, withdrawal due to AEs and outcome.”

In the same article, the appendix also presented p-values for cardiovascular and anti-diabetes medications at baseline and during the trial. However, it could be misleading to interpret the results based on these p-values. For example, for statins introduced during the trial, the rates are 379 / 4668 = 8.1% in the Liraglutide group and 450 / 4672 = 9.6% in the Placebo group, with a p-value of 0.01. While the p-value is statistically significant, the difference in rates (8.1% versus 9.6%) is not really meaningful.
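The gap between statistical and clinical significance here is easy to check with a back-of-the-envelope calculation. The sketch below uses a pooled two-proportion z-test (a normal approximation, not necessarily the method the NEJM authors used) on the statin counts quoted above:

```python
import math

def two_prop_p(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test
    (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # = 2 * (1 - Phi(|z|))

# Statins introduced during trial: 379/4668 (Liraglutide) vs 450/4672 (Placebo)
p = two_prop_p(379, 4668, 450, 4672)
print(round(p, 2))   # 0.01 - "significant", yet the rates differ by only 1.5 points
```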

Similarly, another NEJM article (Goss et al (2016), Extending Aromatase-Inhibitor Adjuvant Therapy to 10 Years) provided p-values for individual adverse events.

Usually, clinical trials are designed to assess the treatment effect on efficacy endpoints, not on safety endpoints such as adverse events and laboratory test results. In a clinical trial, many different adverse events may be reported. Providing a p-value for each adverse event can be mis-interpreted as testing for a statistically significant difference in each event between treatment groups. Uniformly and indiscriminately applying hypothesis testing to tens or hundreds of different adverse event terms is against statistical principles.
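The multiplicity concern is easy to quantify. Assuming, purely for illustration, 100 independent adverse event comparisons with no true treatment differences and a 0.05 threshold:

```python
alpha, m = 0.05, 100      # illustrative: 100 AE terms, all under a true null
expected_false_positives = alpha * m
prob_at_least_one = 1 - (1 - alpha) ** m
print(expected_false_positives)     # 5.0 AE terms expected to reach p < 0.05 by chance
print(round(prob_at_least_one, 3))  # 0.994 - at least one "significant" AE is near-certain
```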

FDA’s Center for Drug Evaluation and Research (CDER) has a reviewer guidance, Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review. It makes the following statements about hypothesis testing for safety endpoints.
Approaches to evaluation of the safety of a drug generally differ substantially from methods used to evaluate effectiveness. Most of the studies in phases 2-3 of a drug development program are directed toward establishing effectiveness. In designing these trials, critical efficacy endpoints are identified in advance, sample sizes are estimated to permit an adequate assessment of effectiveness, and serious efforts are made, in planning interim looks at data or in controlling multiplicity, to preserve the type 1 error (alpha error) for the main end point. It is also common to devote particular attention to examining critical endpoints by defining them with great care and, in many cases, by using blinded committees to adjudicate them. In contrast, with few exceptions, phase 2-3 trials are not designed to test specified hypotheses about safety nor to measure or identify adverse reactions with any pre-specified level of sensitivity. The exceptions occur when a particular concern related to the drug or drug class has arisen and when there is a specific safety advantage being studied. In these cases, there will often be safety studies with primary safety endpoints that have all the features of hypothesis testing, including blinding, control groups, and pre-specified statistical plans.
In the usual case, however, any apparent finding emerges from an assessment of dozens of potential endpoints (adverse events) of interest, making description of the statistical uncertainty of the finding using conventional significance levels very difficult. The approach taken is therefore best described as one of exploration and estimation of event rates, with particular attention to comparing results of individual studies and pooled data. It should be appreciated that exploratory analyses (e.g., subset analyses, to which a great caution is applied in a hypothesis testing setting) are a critical and essential part of a safety evaluation. These analyses can, of course, lead to false conclusions, but need to be carried out nonetheless, with attention to consistency across studies and prior knowledge. The approach typically followed is to screen broadly for adverse events and to expect that this will reveal the common adverse reaction profile of a new drug and will detect some of the less common and more serious adverse reactions associated with drug use.
Identifying Common and Drug-Related Adverse Events
For common adverse events, the reviewer should attempt to identify those events that can reasonably be considered drug related. Although it is tempting to use hypothesis-testing methods, any reasonable correction for multiplicity would make a finding almost impossible, and studies are almost invariably underpowered for statistically valid detection of small differences. The most persuasive evidence for causality is a consistent difference from control across studies, and evidence of dose response.
The reviewer may also consider specifying criteria for the minimum rate and the difference between drug and placebo rate that would be considered sufficient to establish that an event is drug related (e.g., for a given dataset, events occurring at an incidence of at least 5 percent and for which the incidence is at least twice, or some other percentage greater than, the placebo incidence would be considered common and drug related). The reviewer should be mindful that such criteria are inevitably arbitrary and sensitive to sample size.
Standard Analyses and Explorations of Laboratory Data
This review should generally include three standard approaches to the analysis of laboratory data. The first two analyses are based on comparative trial data. The third analysis should focus on all patients in the phase 2 to 3 experience. Analyses are intended to be descriptive and should not be thought of as hypothesis testing. P-values or confidence intervals can provide some evidence of the strength of the finding, but unless the trials are designed for hypothesis testing (rarely the case), these should be thought of as descriptive. Generally, the magnitude of change is more important than the p-value for the difference.
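The arbitrary screening rule described in the guidance is mechanical enough to sketch in code. The function and the AE counts below are hypothetical, assuming a "≥5% incidence and at least twice the placebo rate" rule:

```python
def flag_common_drug_related(events, n_drug, n_placebo,
                             min_rate=0.05, ratio=2.0):
    """Flag AE terms whose drug-arm incidence is at least min_rate and
    at least ratio times the placebo incidence (arbitrary thresholds,
    as the reviewer guidance itself cautions)."""
    flagged = []
    for term, (x_drug, x_placebo) in events.items():
        r_drug, r_placebo = x_drug / n_drug, x_placebo / n_placebo
        if r_drug >= min_rate and r_drug >= ratio * r_placebo:
            flagged.append(term)
    return flagged

# Hypothetical counts (drug, placebo) out of 1000 subjects per arm
events = {"headache": (60, 55), "nausea": (80, 30), "rash": (3, 1)}
print(flag_common_drug_related(events, 1000, 1000))   # ['nausea']
```

Headache fails the ratio test, and rash fails the minimum-rate test; only nausea is flagged, showing how sensitive the output is to the chosen thresholds.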
PhUSE is an independent, not-for-profit organisation run by volunteers. Since its inception, PhUSE has expanded from its roots as a conference for European statistical programmers into a global platform for the discussion of topics encompassing the work of data managers, biostatisticians, statistical programmers, and eClinical IT professionals. PhUSE is run by statistical programmers, and it is attempting to put together some guidelines about how statistical tables should be presented. I guess that statisticians may not agree with all of their proposals.

PhUSE has published a draft proposal, “Analyses and Displays Associated with Adverse Events – Focus on Adverse Events in Phase 2-4 Clinical Trials and Integrated Summary Documents”. The proposal has a specific section about the presentation of p-values in adverse event summary tables.
6.2. P-values and Confidence Intervals
There has been ongoing debate on the value or lack of value for the inclusion of p-values and/or confidence intervals in safety assessments (Crowe, et. al. 2009). This white paper does not attempt to resolve this debate. As noted in the Reviewer Guidance, p-values or confidence intervals can provide some evidence of the strength of the finding, but unless the trials are designed for hypothesis testing, these should be thought of as descriptive. Throughout this white paper, p-values and measures of spread are included in several places. Where these are included, they should not be considered as hypothesis testing. If a company or compound team decides that these are not helpful as a tool for reviewing the data, they can be excluded from the display.
Some teams may find p-values and/or confidence intervals useful to facilitate focus, but have concerns that lack of “statistical significance” provides unwarranted dismissal of a potential signal. Conversely, there are concerns that due to multiplicity issues, there could be over-interpretation of p-values adding potential concern for too many outcomes. Similarly, there are concerns that the lower- or upper-bound of confidence intervals will be over-interpreted. It is important for the users of these TFLs to be educated on these issues.
Similarly, PhUSE also has a white paper, “Analyses and Displays Associated with Demographics, Disposition, and Medications in Phase 2-4 Clinical Trials and Integrated Summary Documents”, where p-values in summary tables for demographics and concomitant medications are also discussed.
6.1.1. P-values
There has been ongoing debate on the value or lack of value of the inclusion of p-values in assessments of demographics, disposition, and medications. This white paper does not attempt to resolve this debate. Using p-values for the purpose of describing a population is generally considered to have no added value. The controversy usually pertains to safety assessments. Throughout this white paper, p-values have not been included. If a company or compound team decides that these will be helpful as a tool for reviewing the data, they can be included in the display.
It is very common that p-values are provided for demographic and baseline characteristics to confirm that key demographic and baseline characteristics are balanced (no difference between treatment groups). These demographic and baseline characteristics are usually the factors used for sub-group analyses.

It is also very common that p-values are not provided for safety and ancillary variables such as adverse events, laboratory parameters, concomitant medications, and medical histories. The obvious concerns are multiplicity, lack of pre-specification, the interpretation of these p-values, and the mis-interpretation of a p-value as a measure of the importance of a result. Safety analyses are still mainly descriptive summaries unless specific safety variables are pre-specified for hypothesis testing. Safety assessment is sometimes based on qualitative rather than quantitative analysis – this is why narratives for serious adverse events (SAEs) play a critical role in safety assessment. For example, it is now well known that the drug Tysabri is effective in treating relapsing-remitting multiple sclerosis but increases the risk of progressive multifocal leukoencephalopathy (PML), an opportunistic viral infection of the brain that usually leads to death or severe disability. PML is very rare and is not expected to be seen in clinical trial subjects. If any PML case is reported in the Tysabri treatment group, it will be considered significant even though the p-value may not be.

Friday, July 15, 2016

Protocol amendment in clinical trials

For every clinical trial, the study protocol is the centerpiece: it dictates how the study should be conducted, what data will be collected, and how the data will be analyzed. Usually, after the IND (Investigational New Drug) application including the study protocol is filed and FDA does not provide any comments (or place it on clinical hold) within 30 days, the sponsor will consider the study protocol approved to proceed. However, it is very common that during study conduct, some aspects of the protocol need to be changed or amended.

Protocol amendments are governed by the Code of Federal Regulations (CFR) section 312.30, Protocol amendments. The CFR states the following regarding protocol amendments:
(b) Changes in a protocol. (1) A sponsor shall submit a protocol amendment describing any change in a Phase 1 protocol that significantly affects the safety of subjects or any change in a Phase 2 or 3 protocol that significantly affects the safety of subjects, the scope of the investigation, or the scientific quality of the study. Examples of changes requiring an amendment under this paragraph include:
(i) Any increase in drug dosage or duration of exposure of individual subjects to the drug beyond that in the current protocol, or any significant increase in the number of subjects under study.
(ii) Any significant change in the design of a protocol (such as the addition or dropping of a control group).
(iii) The addition of a new test or procedure that is intended to improve monitoring for, or reduce the risk of, a side effect or adverse event; or the dropping of a test intended to monitor safety.
One question that is often asked is whether the protocol needs to be amended if the only changes are to the statistical analyses or the sample size. Changes in the statistical analysis or the sample size can be considered changes “that significantly affect the scope of the investigation or the scientific quality of the study”, and are therefore subject to a protocol amendment. Notice the word ‘significantly’, which implies that some changes may be allowed without amending the protocol – for example, a small increase (usually less than 10% of total subjects) in sample size, or changes in the statistical analysis methods for secondary or exploratory endpoints.

In the latest issue of Therapeutic Innovation and Regulatory Science, Getz et al published a paper on the impact of protocol amendments on clinical trial performance and cost. They found that 57% of protocols had at least one substantial amendment, and nearly half (45%) of these amendments were deemed ‘avoidable’. Phase II and III protocols had a mean of 2.2 and 2.3 global amendments, respectively. My experience with clinical trials in rare diseases indicates an even higher percentage of protocols with amendments (almost every protocol) and a higher number of amendments per protocol.
Protocol amendments have a great impact on the conduct of clinical trials:
  • Significant impact on the cost
  • Significant impact on the timeline
  • Significant impact on the resources
  • May have significant impact on the credibility of the study results
  • May result in more protocol deviations

“Unplanned delays, disruptions, and costs associated with implementing protocol amendments have long challenged drug development companies and their contract research partners. Despite a rigorous and extensive internal review and approval process, the majority of finalized protocols are amended multiple times – particularly those directing later-stage phase III studies.
 The frequency of protocol amendments varies by therapeutic area and is highly correlated with more scientifically and operationally complex protocols. Increased amendment frequency per protocol is associated with protocols that have a higher relative number of protocol procedures and eligibility criteria, and more investigative sites dispersed across more countries.
Amendments are implemented for a wide variety of reasons, including the introduction of new standards of care, changes to medications permitted before and during the clinical trial, the availability of new safety data, and requests from regulatory agencies and other oversight organizations (eg, ethical review boards). The top reason for amending a protocol is to modify study volunteer eligibility criteria due to changes in study design strategy and difficulties recruiting patients. “
Some large pharmaceutical companies have started to look at the impact of protocol amendments on overall cost and timelines. I have even heard (unconfirmed) that GSK uses the number of protocol amendments as one of its performance evaluation criteria: the fewer the protocol amendments, the better the performance.

The number of protocol amendments may also be driven by the study design itself. A recent discussion about the phase I dose cohort expansion design noted that it can result in an almost unlimited number of protocol amendments. This specific design has been discussed in my previous posting. As an example, a dose cohort expansion study by Merck has had 50 protocol amendments, and counting.

Adaptive design has been a hot topic in the clinical trial field for the last ten years, although the enthusiasm has died down a little except in the oncology area. One of the key advantages of an adaptive design is that changes are implemented based on pre-specified criteria, which avoids the need for a protocol amendment. For example, if all criteria for pruning treatment arms or for increasing the sample size are pre-specified, then when the criteria are met there is no need for a protocol amendment to implement the changes. This can save time and cost, but may come at the price of lost learning opportunities.
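A minimal sketch of why a pre-specified adaptation rule needs no amendment: the arm-pruning criterion below (drop any arm whose interim response rate falls below a pre-set futility bound) is purely illustrative; the function name, arm labels, and the 0.20 bound are assumptions, not from any actual protocol.

```python
# Illustrative pre-specified arm-pruning rule for an adaptive design.
# Because the rule and the futility bound are written into the protocol
# up front, applying them at the interim analysis requires no amendment.
FUTILITY_BOUND = 0.20  # hypothetical pre-specified response-rate bound

def prune_arms(interim_response_rates: dict[str, float]) -> list[str]:
    """Return the treatment arms that continue past the interim analysis."""
    return [arm for arm, rate in interim_response_rates.items()
            if rate >= FUTILITY_BOUND]

# e.g. {"low dose": 0.12, "mid dose": 0.28, "high dose": 0.35}
# -> only "mid dose" and "high dose" continue
```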

For traditional study designs, it is always desirable to minimize the number of protocol amendments. In reality, protocol amendments are ubiquitous. Reasons for protocol amendments include the following (not intended to be an exhaustive list):
  • Lack of internal expertise in the therapeutic area
  • Lack of consultation with external experts
  • Lack of engagement with the steering committee
  • Lack of engagement with the CRO, which may have first-hand experience with the sites
  • Lack of experience in other countries – the standard of care may be very different elsewhere
  • Engaging the statisticians too late – the statistician should be involved in the study design, including endpoint selection, not just the sample size calculation
  • Signing off the final protocol too early, for example before the pre-IND meeting with FDA
  • Submitting the final protocol when a concept protocol, protocol synopsis, or draft protocol would suffice
  • Inadequate or unrealistic inclusion/exclusion criteria
  • Lack of quality control in the protocol review/approval process
Protocol amendments are often triggered by the following events – the protocol may go through several rounds of amendments before the first patient is enrolled into the study.
  • Protocol amendment after FDA pre-IND meeting
  • Protocol amendment per external committees' requests – such as the Data Monitoring Committee (DMC) or Steering Committee
  • Protocol amendment after the investigator meeting
  • Protocol amendment after FDA’s IND comments
  • Protocol amendment due to the difficulties in patient enrollment
  • Protocol amendment after blinded interim analysis
  • Protocol amendment due to expansion in the number of countries
For multi-national clinical trials, there may be situations where a particular country's regulatory authority requires a slight deviation from an IND study protocol. This may be implemented through a country-specific protocol amendment. However, for country-specific protocol amendments in international studies whose data will support a marketing application, FDA will want to know what was done differently in those countries, so the amendments would need to be submitted to FDA. If the non-U.S. sites are under the IND, these amendments would need to be submitted as specified under 21 CFR 312.30. If the international sites are not officially under the IND, this information would, at the very least, need to accompany the data in the marketing application.

It is not a good idea to have a country-specific protocol amendment that deviates significantly from an IND study protocol. For example, I used to work on a randomized, placebo-controlled study where the regulatory authority in one of the targeted countries did not approve the inclusion of the placebo arm. I was asked whether a country-specific protocol could be used so that the placebo arm could be dropped for that country. In this case, the deviation from the IND study protocol was too big: a country-specific protocol is not a good solution, and that country may need to be excluded from participation in the study.

A small tip for CRF/eCRF revisions due to a protocol amendment:
When inclusion/exclusion criteria are revised in a protocol amendment, to avoid impacting CRF data collection and downstream activities, it is better to:
  • Keep the numbering of the remaining inclusion/exclusion criteria intact if one or more criteria are removed (i.e., skip the removed number). For example, if inclusion criterion #3 is removed, the amended protocol will list inclusion criteria 1, 2, 4, 5 (#3 is skipped).
  • Add any new inclusion/exclusion criteria after the last existing inclusion or exclusion criterion.
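The numbering tip above can be sketched as a simple data structure: keeping criteria keyed by their original numbers means a removed criterion leaves a gap and a new criterion takes the next number, so CRF fields and edit checks keyed to criterion numbers are untouched. The helper names and criterion texts below are illustrative.

```python
def remove_criterion(criteria: dict[int, str], number: int) -> dict[int, str]:
    """Drop a criterion but keep the remaining numbers intact (the removed
    number is simply skipped, so CRF fields keyed to numbers still match)."""
    return {n: text for n, text in criteria.items() if n != number}

def add_criterion(criteria: dict[int, str], text: str) -> dict[int, str]:
    """Append a new criterion after the highest existing number."""
    return {**criteria, max(criteria) + 1: text}

inclusion = {1: "Age >= 18", 2: "Confirmed diagnosis", 3: "ECOG 0-1",
             4: "Adequate renal function", 5: "Signed informed consent"}
amended = remove_criterion(inclusion, 3)        # numbers 1, 2, 4, 5 remain
amended = add_criterion(amended, "No prior exposure to study drug")  # becomes #6
```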

Tuesday, July 05, 2016

Some Blinding Techniques in Clinical Trials

In randomized controlled clinical trials, blinding is one of the key components. The purpose of blinding the treatment assignment is to avoid conscious or unconscious biases in assessing the efficacy and safety endpoints and, therefore, to maintain the integrity of the study. How important is blinding? Look at investigator-initiated studies and early-phase trials – many of them had positive results that were later shown not to hold up.

In terms of blinding technique, researchers should look for three qualities: it must successfully conceal the group allocation; it must not impair the ability to accurately assess outcomes; and it must be acceptable to the individuals who will be assessing outcomes. In some clinical trials, not all three qualities can be met.

Based on how blinding is maintained, clinical trials can be categorized as open-label, single-blind, or double-blind studies.

An open-label study (sometimes called an 'open study' in EU countries) is a study in which both the investigator and the subject know which treatment the subject is receiving. An open-label study can be a study without any control group, or a randomized, controlled, open-label study.

A single-blind study is a study in which the investigator knows the treatment assignment, but the subject does not know which treatment he/she is receiving.

A double-blind study is a study in which neither the investigator nor the subject knows which treatment the subject is receiving.

  • Blinding is defined based on whether the investigator and subject know the treatment assignment; however, in industry, additional parties involved in managing and conducting the trial may also be blinded to the treatment assignment. For example, in double-blind studies, the study team on the sponsor side, the CRO, and vendors are usually also blinded.

  • There is an extended term, 'triple-blind study', defined as a double-blind study in which, in addition, the identities of those enrolled in the study and control groups and/or the details about the nature of the interventions (experimental medications) are withheld from the statistician(s) who analyze the data. Since the study statisticians (with the exception of the DMC statisticians) are part of the study team and usually remain blinded to the treatment assignments during the study, a 'double-blind study' is usually operated as a triple-blind study in practice. This is why we rarely see the term 'triple-blind study' actually used in clinical trials.

  • In practice, for a single-blind study, it is usually better to be conservative and treat it as if it were a double-blind study from the study team's perspective.
Blinding is usually easy to operate when the investigational products are pills: the pills for the investigational product and the control can be manufactured to be identical in size, color, smell, and so on. There are clinical trials where the comparison involves different routes of drug administration, different types of surgical procedures, or different devices. In these situations, blinding the treatment assignments may seem impossible. However, this is not entirely true: there are still techniques that can be employed to achieve a certain level of blinding and minimize biases in assessing the efficacy and safety endpoints.

Using a sham treatment: a sham treatment is an inactive treatment or procedure (usually a medical procedure) that is intended to mimic, as closely as possible, a therapy in a clinical trial. The sham treatment is given to the subjects in the control group to mimic the investigational treatment. With the use of a sham treatment, a seemingly impossible-to-blind study can now be blinded. There are many published examples of sham treatments being used in clinical trials.

Separating the treating physician from the evaluating or examining physician: in some clinical trials, a physician separate from the treating physician or investigator is employed to assess efficacy or safety. The treating physician and the examining physician are kept separate and do not communicate about the assessment results. The treating physician can be unblinded to the treatment assignment while the examining physician remains blinded; therefore, the blinding is maintained. This type of arrangement is useful and necessary in neurological trials, especially multiple sclerosis trials, where many subjective scales are used. The EMA's 2006 Guideline on Clinical Investigation of Medicinal Products for the Treatment of Multiple Sclerosis describes the following:
“As several subjective decisions and assessments will have to be performed, with a considerable risk of bias, all possible efforts should be done to keep the design double blind. In cases where double blind is not possible (some active comparator trials, some easily unblinded treatments,...) a blind observer design with a blinded examining physician different than the treating physician may be used. All measures to ensure reliable single blind evaluation should be guaranteed (i.e. patches that cover injection sites to hide reddening or swellings, education of examining physicians,…).”
Similarly, in FDA's guidance for industry "Rare Diseases: Common Issues in Drug Development", it stated: 

...As another example, effective blinding of treatments can reduce concern about bias in the subjective aspects of an assessment, as can conduct of endpoint evaluation by people not involved in other aspects of the trial (e.g., radiologists, exercise testers).

Some of the examples are: 

Central reader blinded to the clinical data: using a central reader for imaging endpoints, where the central reader is blinded to the clinical data to avoid bias. Additional blinding can also be employed by blinding the order of the baseline image and the subsequent images, so that the changes (for example, in tumor size) assessed by the central reader are more reliable.

"In unblinded clinical trials, clinical information may bias a site-based image interpretation because the expected relation of clinical features to outcome is known and, therefore, local reading will raise concern about potential unblinding. A centralized image interpretation process, fully blinded, may greatly enhance the credibility of image assessments and better ensure consistency of image assessments. Some imaging modalities also may prove vulnerable to site-specific image quality problems, and a centralized imaging interpretation process may help minimize these problems. For example, the National Lung Screening Trial’s experience with computed tomography of the chest suggested that centralized image quality monitoring was important to the reduction of imaging defects (Gierada, Garg, et al. 2009). Hence, a centralized image interpretation process may be used to help control image quality as well as to provide the actual imaging-based endpoint measurements."
"In a time-sequential presentation, a subject’s complete image set (from baseline through the follow-up evaluations) is shown in the order in which the images were obtained. In this process (unless prespecified and justified in the charter), the reader does not initially know the total number of time points in each subject’s image set.
 In a hybrid, randomized image presentation, a subject’s complete image set (or only the postbaseline images) are shown fully randomized. After the read results have been locked for each time point, the images are shown again in known chronological order for re-read. Changes in any of the randomized assessments are tracked and highlighted in the final assessment. In within-subject-control trials (e.g., comparative imaging), images obtained before and after the investigational drug should be presented in fully randomized unpaired fashion and in randomized paired fashion in two separate image evaluations. The minimum number of images in each randomized block necessary to minimize recall should be considered."
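The two presentation schemes quoted above can be sketched as read-order generators. This is only an illustration: the function names, visit labels, and the use of a seeded shuffle (so a documented read order can be reproduced for the re-read) are my assumptions, not part of the guidance.

```python
import random

def time_sequential_order(image_ids: list[str]) -> list[str]:
    """Time-sequential presentation: images shown in acquisition order
    (the reader is not told how many time points the set contains).
    Assumes IDs sort chronologically, e.g. "V01...", "V02..."."""
    return sorted(image_ids)

def randomized_order(image_ids: list[str], seed: int) -> list[str]:
    """Fully randomized presentation of the same image set, reproducible
    via a seed so the read order can be documented and re-created."""
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    return shuffled

visits = ["V01_baseline", "V02_week12", "V03_week24", "V04_week48"]
# randomized_order(visits, seed=7) is a permutation of visits; re-running
# with the same seed reproduces it, supporting a tracked re-read.
```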
Firewall to prevent the sponsor from performing aggregate analyses: building a firewall between the sponsor and the Data Monitoring Committee is a technique necessary to ensure that study integrity is maintained. A firewall between the sponsor and the investigator/contract research organization can also be implemented in an open-label or single-blind study, so that the sponsor is prevented from accessing the cumulative data for the primary efficacy endpoint. The primary efficacy endpoint information is accessible to the investigator and the CRO but withheld from the sponsor. While the investigator and CRO may have some bias from knowing the treatment assignment, bias on the sponsor's side can be prevented.

Additional reading:

1.      Kenneth F Schulz, David A Grimes (2002) Blinding in randomised trials: hiding who got what
2.      Karanicolas et al (2009) Blinding: Who, what, when, why, how?