
Let’s discuss the differences between the first version of HaDeX (Puchała et al. 2020) and the current version of the package and web-server. Because the second version is significantly more complex than the first, we decided against expanding the original package and created a new one instead. But how do they differ?

Methods of visualization

Let’s start by discussing methods of visualization available in the packages and web-servers:

plot_type HaDeX HaDeX2
comparison TRUE TRUE
woods TRUE TRUE
uptake curve TRUE TRUE
diff uptake curve FALSE TRUE
butterfly FALSE TRUE
diff butterfly FALSE TRUE
chiclet FALSE TRUE
diff chiclet FALSE TRUE
heatmap FALSE TRUE
diff heatmap FALSE TRUE
3D structure FALSE TRUE
volcano FALSE TRUE
manhattan FALSE TRUE
uncertainty FALSE TRUE
coverage TRUE TRUE
coverage heatmap FALSE TRUE
measurement variability FALSE TRUE
mass uptake curve FALSE TRUE

Web-server features

Let’s see how the features in web-servers differ.

option HaDeX HaDeX2
tooltips TRUE TRUE
helpers TRUE TRUE
tabular data TRUE TRUE
times next to each other FALSE TRUE
export to external tools FALSE TRUE

The table uses shortened names. Tabular data means that the data behind a plot is also available in tabular form. Times next to each other means that data for all time points can be shown either in a single plot or as smaller plots placed side by side, one per time point. Export to external tools means that data can be downloaded for tools such as HDXViewer or ChimeraX.

Moreover, HaDeX2 offers more visualization options, as listed in the Methods of visualization table above.

Package functions

Finally, let’s take a look at the package functions from HaDeX2 and see whether each has a counterpart in HaDeX (possibly under a different name). NA means that HaDeX has no corresponding function:

HaDeX HaDeX2
HaDeX_gui HaDeX_GUI
add_stat_dependency add_stat_dependency
calculate_confidence_limit_values calculate_confidence_limit_values
calculate_kinetics calculate_kinetics
calculate_state_deuteration create_state_uptake_dataset
comparison_plot plot_state_comparison
plot_coverage plot_coverage
plot_kinetics plot_uptake_curve
plot_position_frequency plot_overlap_distribution
read_hdx read_hdx
reconstruct_sequence reconstruct_sequence
woods_plot plot_differential
NA HaDeXify
NA calculate_MHP
NA calculate_aggregated_diff_uptake
NA calculate_aggregated_test_results
NA calculate_aggregated_uptake
NA calculate_auc
NA calculate_back_exchange
NA calculate_diff_uptake
NA calculate_exp_masses
NA calculate_exp_masses_per_replicate
NA calculate_p_value
NA calculate_peptide_kinetics
NA calculate_state_uptake
NA create_aggregated_diff_uptake_dataset
NA create_aggregated_uptake_dataset
NA create_control_dataset
NA create_diff_uptake_dataset
NA create_kinetic_dataset
NA create_overlap_distribution_dataset
NA create_p_diff_uptake_dataset
NA create_p_diff_uptake_dataset_with_confidence
NA create_replicate_dataset
NA create_state_comparison_dataset
NA create_uptake_dataset
NA get_n_replicates
NA get_peptide_sequence
NA get_protein_coverage
NA get_protein_redundancy
NA get_replicate_list_sd
NA get_residue_positions
NA get_structure_color
NA install_GUI
NA plot_aggregated_differential_uptake
NA plot_aggregated_uptake
NA plot_aggregated_uptake_structure
NA plot_amino_distribution
NA plot_butterfly
NA plot_chiclet
NA plot_coverage_heatmap
NA plot_differential_butterfly
NA plot_differential_chiclet
NA plot_differential_uptake_curve
NA plot_manhattan
NA plot_overlap
NA plot_peptide_charge_measurement
NA plot_peptide_mass_measurement
NA plot_position_frequency
NA plot_quality_control
NA plot_replicate_histogram
NA plot_replicate_mass_uptake
NA plot_uncertainty
NA plot_volcano
NA prepare_hdxviewer_export
NA quality_control_dataset
NA show_aggregated_uptake_data
NA show_coverage_heatmap_data
NA show_diff_uptake_data
NA show_diff_uptake_data_confidence
NA show_overlap_data
NA show_p_diff_uptake_data
NA show_peptide_charge_measurement
NA show_peptide_mass_measurement
NA show_quality_control_data
NA show_replicate_histogram_data
NA show_summary_data
NA show_uc_data
NA show_uptake_data
NA update_hdexaminer_file
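
When migrating an old script, the renamed rows of the table can be kept as a small lookup. The sketch below is only an illustration: the object `hadex2_name` and the helper `translate_call` are hypothetical names and not part of either package, and the mapping covers only the six rows where the name changed.

```r
# Old (HaDeX) name -> new (HaDeX2) name, copied from the renamed rows of the table.
hadex2_name <- c(
  HaDeX_gui = "HaDeX_GUI",
  calculate_state_deuteration = "create_state_uptake_dataset",
  comparison_plot = "plot_state_comparison",
  plot_kinetics = "plot_uptake_curve",
  plot_position_frequency = "plot_overlap_distribution",
  woods_plot = "plot_differential"
)

# Hypothetical helper: returns the HaDeX2 name for a renamed function,
# or the input unchanged when the name is not in the lookup.
translate_call <- function(old_name) {
  new_name <- hadex2_name[old_name]
  unname(ifelse(is.na(new_name), old_name, new_name))
}

translated <- translate_call(c("woods_plot", "read_hdx"))
```

A name lookup alone does not make old code run: argument names differ as well, for example `time_in`/`time_out` in HaDeX versus `time_0`/`time_100` in HaDeX2 in the benchmark code further down.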

Calculation speeds compared

For the function pairs from the previous section, we can assess relative execution speed using the example dataset as a controlled reference. We concentrate on six major tasks: reading the data file, preparing and plotting the uptake curve for one peptide, plotting the comparison of two biological states, plotting the differential Woods plot for two states, calculating confidence limits, and reconstructing the protein sequence.

The code used to create this comparison is shown below, followed by the results.

library(HaDeX)
library(microbenchmark) # provides microbenchmark()
library(magrittr)       # provides the %>% pipe

dat_HaDeX <- HaDeX::read_hdx(system.file(package = "HaDeX2", "HaDeX/data/alpha.csv"))
dat_HaDeX2 <- HaDeX2::read_hdx(system.file(package = "HaDeX2", "HaDeX/data/alpha.csv"))

version_benchmark <- microbenchmark(
  list = alist(`HaDeX_1. Read input` = HaDeX::read_hdx(system.file(package = "HaDeX2", 
                                                                   "HaDeX/data/alpha.csv")),
               `HaDeX2_1. Read input` = HaDeX2::read_hdx(system.file(package = "HaDeX2", 
                                                                     "HaDeX/data/alpha.csv")),
               `HaDeX_2. Plot uptake curve` = {
                 HaDeX::calculate_kinetics(dat = dat_HaDeX, 
                                           sequence = "GFGDLKSPAGL",      
                                           state = "Alpha_KSCN", 
                                           start = 1, end = 11, 
                                           time_in = 0, time_out = 1440) %>%   
                   HaDeX::plot_kinetics(kin_dat = .)},
               `HaDeX2_2. Plot uptake curve` = {
                 HaDeX2::calculate_peptide_kinetics(dat = dat_HaDeX2,
                                                    sequence = "GFGDLKSPAGL",
                                                    state = "Alpha_KSCN",
                                                    start = 1, end = 11,
                                                    time_0 = 0, time_100 = 1440) %>%
                   HaDeX2::plot_uptake_curve(uc_dat = .)},
               `HaDeX_3. Plot comparison` = {
                 HaDeX::prepare_dataset(dat = dat_HaDeX, 
                                        in_state_first = "Alpha_KSCN_0",      
                                        chosen_state_first = "Alpha_KSCN_1", 
                                        out_state_first = "Alpha_KSCN_1440",      
                                        in_state_second = "ALPHA_Gamma_0", 
                                        chosen_state_second = "ALPHA_Gamma_1",      
                                        out_state_second = "ALPHA_Gamma_1440") %>%
                   HaDeX::comparison_plot(calc_dat = ., 
                                          theoretical = FALSE,      
                                          relative = TRUE, 
                                          state_first = "Alpha_KSCN", 
                                          state_second = "ALPHA_Gamma")},
               `HaDeX2_3. Plot comparison` = {
                 HaDeX2::create_state_comparison_dataset(dat = dat_HaDeX2,
                                                         states = c("Alpha_KSCN", 
                                                                    "ALPHA_Gamma"),
                                                         time_0 = 0, time_100 = 1440) %>%
                   HaDeX2::plot_state_comparison(uptake_dat = .,
                                                 theoretical = FALSE,
                                                 fractional = TRUE,
                                                 time_t = 1)},
               `HaDeX_4. Plot Woods` = {
                 HaDeX::prepare_dataset(dat = dat_HaDeX, 
                                        in_state_first = "Alpha_KSCN_0",      
                                        chosen_state_first = "Alpha_KSCN_1", 
                                        out_state_first = "Alpha_KSCN_1440",      
                                        in_state_second = "ALPHA_Gamma_0", 
                                        chosen_state_second = "ALPHA_Gamma_1",      
                                        out_state_second = "ALPHA_Gamma_1440") %>%
                   HaDeX::woods_plot(calc_dat = ., 
                                     theoretical = FALSE,      
                                     relative = TRUE, 
                                     confidence_limit = 0.98, 
                                     confidence_limit_2 = 0.98)},
               `HaDeX2_4. Plot Woods` = {
                 HaDeX2::calculate_diff_uptake(dat = dat_HaDeX2,
                                               states = c("Alpha_KSCN", "ALPHA_Gamma"),
                                               time_t = 1, time_0 = 0, time_100 = 1440) %>%
                   HaDeX2::plot_differential(diff_uptake_dat = .,
                                             time_t = 1,
                                             theoretical = FALSE,
                                             fractional = TRUE,
                                             show_houde_interval = TRUE,
                                             confidence_level = 0.98)},
               `HaDeX_5. Calculate confidence limit` = {
                 HaDeX::prepare_dataset(dat = dat_HaDeX, 
                                        in_state_first = "Alpha_KSCN_0",      
                                        chosen_state_first = "Alpha_KSCN_1", 
                                        out_state_first = "Alpha_KSCN_1440",      
                                        in_state_second = "ALPHA_Gamma_0", 
                                        chosen_state_second = "ALPHA_Gamma_1",      
                                        out_state_second = "ALPHA_Gamma_1440") %>%
                   HaDeX::calculate_confidence_limit_values(calc_dat = .,
                                                            confidence_limit = 0.98,
                                                            theoretical = FALSE,
                                                            relative = TRUE)},
                `HaDeX2_5. Calculate confidence limit` = {
                 HaDeX2::calculate_diff_uptake(dat = dat_HaDeX2,
                                               states = c("Alpha_KSCN", "ALPHA_Gamma"),
                                               time_0 = 0, time_100 = 1440, time_t = 1) %>%
                   HaDeX2::calculate_confidence_limit_values(diff_uptake_dat = .,
                                                             confidence_level  = 0.98,
                                                             theoretical = FALSE,
                                                             fractional = TRUE)},
               `HaDeX_6. Reconstruct sequence` = HaDeX::reconstruct_sequence(dat = dat_HaDeX),
               `HaDeX2_6. Reconstruct sequence` =  HaDeX2::reconstruct_sequence(dat = dat_HaDeX2)
               
               )
)

By default, the microbenchmark package executes each expression 100 times and records the execution times in nanoseconds. The results are then presented side by side in graphical form to allow a direct comparison between the two versions for each task.
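
The step from raw timings to a per-task summary can be sketched in base R. This is a minimal sketch under stated assumptions: `toy_benchmark` is a hand-made stand-in with the same shape as a microbenchmark result (one row per run, an `expr` column and a `time` column in nanoseconds), its values are made up for illustration, and the median is assumed as the summary statistic, which the text does not state.

```r
# Toy stand-in for a microbenchmark result: one row per run, time in nanoseconds.
# The timings are made-up illustration values, not measurements.
toy_benchmark <- data.frame(
  expr = rep(c("HaDeX_1. Read input", "HaDeX2_1. Read input"), each = 3),
  time = c(30e6, 35e6, 40e6, 20e6, 28e6, 30e6)
)

# Median per expression, converted from nanoseconds to milliseconds.
med <- aggregate(time ~ expr, data = toy_benchmark, FUN = median)
med$time_ms <- med$time / 1e6

# Split names such as "HaDeX2_1. Read input" at the first underscore
# into the package version and the task label.
med$version <- sub("_.*$", "", med$expr)
med$task <- sub("^[^_]*_", "", med$expr)

# One row per task, one timing column per version.
wide <- reshape(med[, c("task", "version", "time_ms")],
                idvar = "task", timevar = "version", direction = "wide")
```

The naming convention used in the benchmark (`HaDeX_` or `HaDeX2_` followed by the task label) is what makes this split possible, so both versions of a task end up in the same row.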

All the tasks are completed significantly faster when using HaDeX2.

Below, we present the summary of numeric values used to create this plot (in milliseconds):

task HaDeX [ms] HaDeX2 [ms] Runtime ratio (HaDeX2/HaDeX)
1. Read input 34.83070 28.86460 0.8287115
2. Plot uptake curve 171.86675 65.38155 0.3804200
3. Plot comparison 186.86305 59.34140 0.3175663
4. Plot Woods 201.53030 77.75905 0.3858430
5. Calculate confidence limit 172.91570 53.16090 0.3074383
6. Reconstruct sequence 24.18105 15.78765 0.6528935

Across all tasks, the runtime ratio (HaDeX2/HaDeX) is below one, so HaDeX2 is faster than HaDeX for every measured operation. The strongest speed-ups, corresponding to the lowest ratios (about 0.31 to 0.39), are observed for calculating confidence limits and for the three plotting tasks (comparison, uptake curve and Woods plots), whereas input reading (0.83) and sequence reconstruction (0.65) show smaller, though still meaningful, reductions in execution time. For input reading, the speed-up is modest because read_hdx in HaDeX2 includes substantially expanded built-in quality control: the additional validation steps intentionally trade some speed for improved data integrity.
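
The ratios can be recomputed directly from the table. The sketch below copies the millisecond values from the summary table; the data frame name `timings` and the extra `speedup` column (the inverse of the ratio) are added here for illustration only.

```r
# Timings in milliseconds, copied from the summary table.
timings <- data.frame(
  task = c("1. Read input", "2. Plot uptake curve", "3. Plot comparison",
           "4. Plot Woods", "5. Calculate confidence limit",
           "6. Reconstruct sequence"),
  HaDeX = c(34.83070, 171.86675, 186.86305, 201.53030, 172.91570, 24.18105),
  HaDeX2 = c(28.86460, 65.38155, 59.34140, 77.75905, 53.16090, 15.78765)
)

# Runtime ratio as defined in the text: HaDeX2 time divided by HaDeX time.
timings$ratio <- timings$HaDeX2 / timings$HaDeX

# Equivalent speed-up factor: how many times faster HaDeX2 is.
timings$speedup <- 1 / timings$ratio

# Tasks ordered from the lowest ratio (largest speed-up) to the highest.
ranked <- timings[order(timings$ratio), c("task", "ratio", "speedup")]
```

Read this way, a ratio of roughly 0.31 corresponds to HaDeX2 being a little over three times faster for that task.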

HaDeX2 design

The first version of HaDeX was developed quickly to address immediate data analysis challenges. As knowledge in the field expanded, it became necessary to extend the package’s functionality, which required a carefully planned redesign. The package is now built from small, modular computing blocks: encapsulated functions that each perform a single task. Datasets are created by combining these functions. This design allows individual components to be tested independently and improves code readability through self-explanatory function names (calculate_ functions provide results for a specific time point, whereas create_..._dataset functions provide them for all time points). The parameter naming conventions were also simplified. In addition, the graphical user interface was rewritten from scratch using Shiny modules to ensure clear separation and encapsulation of features.
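
The relationship between a single-time-point block and a dataset-building block can be illustrated with a toy example. Everything in this sketch is hypothetical: `toy_dat`, `calculate_toy_uptake` and `create_toy_uptake_dataset` are invented names that only mimic the naming pattern, and the uptake calculation is deliberately simplified. It is not the HaDeX2 implementation.

```r
# Hypothetical toy data: mean mass of one peptide at several time points.
toy_dat <- data.frame(
  time = c(0, 1, 5, 25),
  mass = c(1000.0, 1001.2, 1002.0, 1002.5)
)

# "calculate_"-style block: result for a single time point.
calculate_toy_uptake <- function(dat, time_t, time_0 = 0) {
  m_0 <- dat$mass[dat$time == time_0]
  m_t <- dat$mass[dat$time == time_t]
  data.frame(time = time_t, uptake = m_t - m_0)
}

# "create_..._dataset"-style block: reuses the single-time-point block
# for every measured time point and binds the results together.
create_toy_uptake_dataset <- function(dat, time_0 = 0) {
  times <- setdiff(unique(dat$time), time_0)
  do.call(rbind, lapply(times, function(t) calculate_toy_uptake(dat, t, time_0)))
}

toy_uptake <- create_toy_uptake_dataset(toy_dat)
```

Because the dataset function only loops over the smaller block, the single-time-point calculation can be tested on its own, which is the benefit the paragraph above describes.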

Puchała, Weronika, Michał Burdukiewicz, Michał Kistowski, Katarzyna A. Dąbrowska, Aleksandra E. Badaczewska-Dawid, Dominik Cysewski, and Michał Dadlez. 2020. “HaDeX: An R Package and Web-Server for Analysis of Data from Hydrogen-Deuterium Exchange Mass Spectrometry Experiments.” Bioinformatics (Oxford, England) 36 (16): 4516–18. https://doi.org/10.1093/bioinformatics/btaa587.