Quality control
quality_control.Rmd
Data analysis is vital step, but firsly, we must make sure that the experiment was conducted correctly. How to do that using HaDeX functionalities? There are multiple ways to do that. Unfortunately, our example data is good quality, but we will discuss possible indicators of data that needs to be re-checked.
Repliates
Firsly, we check the number of replicates. Although the experiment is conducted in known number of times, sometimes during the manual curation of isotopic envelopes, some ofthe spectras are disqualified. Here, we check the remaing number of replicates of the already cureated dataset.
It is advised to have at least three replicates of the experiment. Lesser number limits the statisitcal analysis - e.q. for calculating P-value using T-student test minimum of three repliates are needed. If there are only two replicates, the uncertainty can be calculated.
In the example plot below, we see that there are three peptides in this state that are lacking values. One peptide - with considerably lack of replicates potentially should be eliminated from the peptide pool. Majority of peptides have a stable number of replicates - in this case three, with expception of no-deuterated control sample, measured only once.
rep_dat <- create_replicate_dataset(alpha_dat)
plot_replicate_histogram(rep_dat)
Uncertainty
Uncertainty plot presents the uncertainty of the measurement for one state at the time. In this case, we present the time points separately. There is no uncertainty calculated for time point 0 min, as it is treated as undeuterated control and measured only once. For other time points, as the measurement was conducted in triplicate, there are values for measured mass. As the measured mass is in daltons, also the uncertainty is presented in daltons. Thus, we can establish a threshold of 1 dalton - indicating one exchange between proton and deuter. Any uncertainty exceeding - or coming close - to this threshold should be carefully checked.
For correctly conducted experiment, the uncertainty for all peptides in all time points should be close to 0.
In the example below, all measurement have acceptable level of uncertainty. Of course, those values differ depending on the region or peptide length, but overall are not exceeding 0.25 Da.
plot_uncertainty(alpha_dat, state = "Alpha_KSCN")
Measurements
Once we have the overview of the situation, we can dig deeper into the peptides of interest.
Below, we present the measurement plots for peptide “GDGDLKSPAGL” in one state, for three replicates. The measurements were done for two possible charge values: 1 and 2, indicated by color. We see that the latter was occuring with greater intensity, indicated by the point size. The aggregated value for each replicate was done by weigting the mass measurements with regard to their measured intensity. Then, the we calculated the mean value, that is the final aggregated value for this peptide in this time point. This value is indicated by the red vertical line, with red area indicating the uncertainty surrounding calculated value.
plot_peptide_mass_measurement(alpha_dat)
The peptide recognized as under-measured using previous plot, is peptide “CVRSIQA” in state eEF1B. We analyse the mass uptake with regards to replicates (aggregated charge) and we see that for time point 25 min only one measurement was accepted, and for other time points two, thus blocking the possibility of calculating statistical significance when using this peptide in comparative analysis.
plot_replicate_mass_uptake(alpha_dat, sequence = "CVRSIQA", aggregated = TRUE)
Back-exchange
Imporant measure of quality of experiment is back-exchange. Usually, expected value of back-exchange should be around 30%, possibly higher for shorter peptides.
bex_dat <- calculate_back_exchange(alpha_dat, state = "Alpha_KSCN")
plot_coverage_heatmap(bex_dat, value = "back_exchange")