Comprehensive, Quantitative Analysis of SRM 1950: the NIST Human Plasma Reference Material.
2025
Mandal R; Zheng J; Zhang L; Oler E; LeVatte MA; Berjanskii M; Lipfert M; Han J; Borchers CH; Wishart DS
Analytical chemistry
Vol. 97
(1)
, pp. 667-675
Many analytical methods have been developed for performing targeted metabolomics. By combining multiple analytical techniques, comprehensive coverage of the metabolome can be achieved. We combined multiple analytical techniques to comprehensively and quantitatively characterize the widely studied NIST human plasma reference material, SRM 1950. Our goal was to provide a large, well-validated list of confident metabolite concentration values (i.e., benchmarks) to assist the metabolomics community in its calibration and comparison efforts. We used four analytical platforms: high-resolution NMR spectroscopy, direct injection tandem MS (DI-MS/MS), liquid chromatography tandem MS (LC-MS/MS), and inductively coupled plasma MS (ICP-MS). Eight validated analytical assays were run, yielding accurate quantitative measurements for 728 unique metabolites or metabolite species. Through computer-aided literature mining, we identified another 330 unique metabolites previously quantified in SRM 1950. We compared NIST-certified values along with literature-derived concentrations/ranges to the metabolite concentrations measured by our four platforms and eight assays. From these assays/platforms, we generated a list of high-confidence concentration values of 1058 metabolites or metabolite species in SRM 1950 including data for 60 amino acids/related compounds, 48 bile acids, 72 amines/sugars/alcohols, 21 metals, 8 catecholamines, 11 vitamins, 92 organic acids, 40 fatty acids/steroids/nucleobases/indole derivatives, 5 polyfluorinated compounds, 7 carotenoids, 39 acylcarnitines, 76 oxylipins, 13 sterols, and 566 lipids/lipid species. This data set represents the most complete quantitative characterization of SRM 1950. An online database (SRM1950-DB) containing 1058 plasma metabolites/metabolite species in SRM 1950, their structures, HMDB IDs, mass, chemical class, concentrations, references, and reliability is freely available at https://srm1950-data.wishartlab.com.
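As a quick consistency check on the numbers quoted in this abstract, the per-class counts and the measured-plus-literature split can be tallied. The sketch below uses only the figures stated above; the variable names are illustrative and are not taken from SRM1950-DB.

```python
# Counts per chemical class, copied from the abstract above.
class_counts = {
    "amino acids/related compounds": 60,
    "bile acids": 48,
    "amines/sugars/alcohols": 72,
    "metals": 21,
    "catecholamines": 8,
    "vitamins": 11,
    "organic acids": 92,
    "fatty acids/steroids/nucleobases/indole derivatives": 40,
    "polyfluorinated compounds": 5,
    "carotenoids": 7,
    "acylcarnitines": 39,
    "oxylipins": 76,
    "sterols": 13,
    "lipids/lipid species": 566,
}

measured = 728     # metabolites quantified by the eight assays
literature = 330   # additional metabolites found by literature mining

total_by_class = sum(class_counts.values())
total_by_source = measured + literature

print(total_by_class, total_by_source)  # prints "1058 1058"
```

Both tallies agree with the 1058 metabolites or metabolite species reported for the database.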
The Natural Products Magnetic Resonance Database (NP-MRD) for 2025.
2025
Wishart DS; Sajed T; Pin M; Poynton EF; Goel B; Lee BL; Guo AC; Saha S; Sayeeda Z; Han S; Berjanskii M; Peters H; Oler E; Gautam V; Jordan T; Kim J; Ledingham B; Tretter ZM; Koller JT; Shreffler HA; Stillwell LR; Jystad AM; Govind N; Bade JL; Sumner LW; Linington RG; Cort JR
Nucleic acids research
Vol. 53
(D1)
, pp. D700-D708
The Natural Products Magnetic Resonance Database (NP-MRD; https://np-mrd.org) is a comprehensive, freely accessible, web-based resource for the deposition, distribution, extraction, and retrieval of nuclear magnetic resonance (NMR) data on natural products (NPs). The NP-MRD was initially established to support compound de-replication and data dissemination for the NP community. However, that community has now grown to include many users from the metabolomics, microbiomics, foodomics, and nutrition science fields. Indeed, since its launch in 2022, the NP-MRD has expanded enormously in size, scope, and popularity. The current version of NP-MRD now contains nearly 7x more compounds (281 859 versus 40 908) and 7x more NMR spectra (5.5 million versus 817 278) than the first release. More specifically, an additional 4.6 million predicted spectra and another 11 000 spectra simulated from experimental chemical shifts were deposited into the database. Likewise, the number of NMR raw spectral data depositions has grown from 165 spectra per year to >10 000 per year. As a result of this expansion, the number of monthly webpage views has grown from 55 to 20 000 and the number of monthly visitors has increased from 7 to 2500. To address this growth and to better support the expanding needs of its diverse community of users, many additional improvements to the NP-MRD have been made. These include significant enhancements to the data submission process, notable updates to the database's spectral search utilities and useful additions to support better NMR spectral analysis/prediction. Significant efforts have also been undertaken to remediate and update many of NP-MRD's database entries. This manuscript describes these database improvements and expansion efforts, along with how they have been implemented and what future upgrades to the NP-MRD are planned.
MarkerDB 2.0: a comprehensive molecular biomarker database for 2025.
2025
Jackson H; Oler E; Torres-Calzada C; Kruger R; Hira AS; Lopez-Hernandez Y; Pandit D; Wang J; Yang K; Fatokun O; Berjanskii M; MacKay S; Sajed T; Han S; Woudstra R; Sykes G; Poelzer J; Sivakumaran A; Gautam V; Wong G; Wishart DS
Nucleic acids research
Vol. 53
(D1)
, pp. D1415-D1426
MarkerDB (https://markerdb.ca) has become a leading resource for comprehensive information on molecular biomarkers. Over the past 3 years, the database has evolved significantly, reflecting the dynamic landscape of biomarker research and increasing demands from its user community. This year's update, which is called MarkerDB 2.0, introduces key improvements to enhance the database's usability, consistency and the range of biomarkers covered. These improvements include (i) the addition of thousands of new biomarkers and associated health conditions, (ii) the inclusion of many new biomarker types and categories, (iii) upgraded searches and data filtering functionalities, (iv) new features for exploring and understanding biomarker panels and (v) significantly expanded and improved descriptions. These upgrades, along with numerous minor improvements in content, interface, layout and overall website performance, have greatly enhanced MarkerDB's usability and capacity to facilitate biomarker interpretation across various research domains. MarkerDB remains committed to providing a free, publicly accessible platform for consolidated information on a wide range of molecular (protein, genetic, chromosomal and chemical/small molecule) biomarkers, covering diagnostic, prognostic, risk, monitoring, safety and response-related biomarkers. We are confident that these upgrades and updates will improve MarkerDB's user friendliness, increase its utility and greatly expand its potential applications to many other areas of clinical medicine and biomedical research.
Accurate Prediction of (1)H NMR Chemical Shifts of Small Molecules Using Machine Learning.
2024
Sajed T; Sayeeda Z; Lee BL; Berjanskii M; Wang F; Gautam V; Wishart DS
Metabolites
Vol. 14
(5)
NMR is widely considered the gold standard for organic compound structure determination. As such, NMR is routinely used in organic compound identification, drug metabolite characterization, natural product discovery, and the deconvolution of metabolite mixtures in biofluids (metabolomics and exposomics). In many cases, compound identification by NMR is achieved by matching measured NMR spectra to experimentally collected NMR spectral reference libraries. Unfortunately, the number of available experimental NMR reference spectra, especially for metabolomics, medical diagnostics, or drug-related studies, is quite small. This experimental gap could be filled by predicting NMR chemical shifts for known compounds using computational methods such as machine learning (ML). Here, we describe how a deep learning algorithm that is trained on a high-quality, "solvent-aware" experimental dataset can be used to predict (1)H chemical shifts more accurately than any other known method. The new program, called PROSPRE (PROton Shift PREdictor) can accurately (mean absolute error of <0.10 ppm) predict (1)H chemical shifts in water (at neutral pH), chloroform, dimethyl sulfoxide, and methanol from a user-submitted chemical structure. PROSPRE (pronounced "prosper") has also been used to predict (1)H chemical shifts for >600,000 molecules in many popular metabolomic, drug, and natural product databases.
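The accuracy figure quoted above is a mean absolute error (MAE) between predicted and observed chemical shifts. A minimal sketch of that metric follows; the shift values are invented for illustration and are not PROSPRE output, and the function name is hypothetical.

```python
def mean_absolute_error(predicted, observed):
    """Average of |predicted - observed| over paired values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical (1)H chemical shifts in ppm, invented for this example only.
predicted_ppm = [1.20, 3.55, 7.30, 2.10]
observed_ppm = [1.25, 3.50, 7.40, 2.00]

mae = mean_absolute_error(predicted_ppm, observed_ppm)
print(f"MAE = {mae:.3f} ppm")
```

With these made-up values the MAE falls below the 0.10 ppm threshold mentioned in the abstract; real performance depends on the compounds and solvents tested.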
One of the major challenges currently faced by global health systems is the prolonged COVID-19 syndrome (also known as "long COVID") which has emerged as a consequence of the SARS-CoV-2 epidemic. It is estimated that at least 30% of patients who have had COVID-19 will develop long COVID. In this study, our goal was to assess the plasma metabolome in a total of 100 samples collected from healthy controls, COVID-19 patients, and long COVID patients recruited in Mexico between 2020 and 2022. A targeted metabolomics approach using a combination of LC-MS/MS and FIA MS/MS was performed to quantify 108 metabolites. IL-17 and leptin were measured in long COVID patients by immunoenzymatic assay. The comparison of paired COVID-19/long COVID-19 samples revealed 53 metabolites that were statistically different. Compared to controls, 27 metabolites remained dysregulated even after two years. Post-COVID-19 patients displayed a heterogeneous metabolic profile. Lactic acid, lactate/pyruvate ratio, ornithine/citrulline ratio, and arginine were identified as the most relevant metabolites for distinguishing patients with more complicated long COVID evolution. Additionally, IL-17 levels were significantly increased in these patients. Mitochondrial dysfunction, redox state imbalance, impaired energy metabolism, and chronic immune dysregulation are likely to be the main hallmarks of long COVID even two years after acute COVID-19 infection.
MagMet: A fully automated web server for targeted nuclear magnetic resonance metabolomics of plasma and serum.
2023
Rout M; Lipfert M; Lee BL; Berjanskii M; Assempour N; Fresno RV; Cayuela AS; Dong Y; Johnson M; Shahin H; Gautam V; Sajed T; Oler E; Peters H; Mandal R; Wishart DS
Magnetic resonance in chemistry : MRC
Vol. 61
(12)
, pp. 681-704
Nuclear magnetic resonance (NMR) spectral analysis of biofluids can be a time-consuming process, requiring the expertise of a trained operator. With NMR becoming increasingly popular in the field of metabolomics, there is a growing need to change this paradigm and to automate the process. Here we introduce MagMet, an online web server that automates the processing and quantification of 1D (1)H NMR spectra from biofluids, specifically human serum/plasma metabolites, including those associated with inborn errors of metabolism (IEM). MagMet uses a highly efficient data processing procedure that performs automatic Fourier Transformation, phase correction, baseline optimization, chemical shift referencing, water signal removal, and peak picking/peak alignment. MagMet then uses the peak positions, linewidth information, and J-couplings from its own specially prepared standard metabolite reference spectral NMR library of 85 serum/plasma compounds to identify and quantify compounds from experimentally acquired NMR spectra of serum/plasma. MagMet employs linewidth adjustment for more consistent quantification of metabolites from higher field instruments and incorporates a highly efficient data processing procedure for more rapid and accurate detection and quantification of metabolites. This optimized algorithm allows the MagMet webserver to quickly detect and quantify 58 serum/plasma metabolites in 2.6 min per spectrum (when processing a dataset of 50-100 spectra). MagMet's performance was also assessed using spectra collected from defined mixtures (simulating other biofluids), with >100 previously measured plasma spectra, and from spiked serum/plasma samples simulating known IEMs. In all cases, MagMet performed with precision and accuracy matching the performance of human spectral profiling experts. MagMet is available at http://magmet.ca.
Residual feed intake (RFI) is a feed efficiency measure commonly used in the livestock industry to identify animals that efficiently/inefficiently convert feed into meat or body mass. Selection for low-residual feed intake (LRFI), or feed efficient animals, is gaining popularity among beef producers because LRFI cattle eat less and produce less methane per unit weight gain. RFI is a difficult and time-consuming measure to perform, and therefore a simple blood test that could distinguish high-RFI (HRFI) from LRFI animals (early on) would potentially benefit beef farmers in terms of optimizing production or selecting which animals to cull or breed. Using three different metabolomics platforms (nuclear magnetic resonance (NMR) spectroscopy, liquid chromatography-tandem mass spectrometry (LC-MS/MS), and inductively coupled plasma mass spectrometry (ICP-MS)), we successfully identified serum biomarkers for RFI that could potentially be translated to an RFI blood test. One set of predictive RFI biomarkers included formate and leucine (best for NMR), and another set included C4 (butyrylcarnitine) and LysoPC(28:0) (best for LC-MS/MS). These serum biomarkers have high sensitivity and specificity (AUROC > 0.85) for distinguishing HRFI from LRFI animals. These results suggest that serum metabolites could be used to inexpensively predict and categorize bovine RFI values. Further validation using a larger, more diverse cohort of cattle is required to confirm these findings.
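The AUROC quoted above can be read as the probability that a randomly chosen animal from one group receives a higher biomarker score than a randomly chosen animal from the other group. The sketch below computes it by direct pairwise comparison; the scores and the function name are hypothetical and are not data from the study.

```python
def auroc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, invented for this example only.
hrfi_scores = [0.9, 0.8, 0.7, 0.4]   # treated here as the "positive" group
lrfi_scores = [0.6, 0.3, 0.2, 0.1]

print(auroc(hrfi_scores, lrfi_scores))  # prints 0.9375
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the reported AUROC > 0.85 indicates strong but not perfect discrimination.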
A simple method to measure protein side-chain mobility using NMR chemical shifts.
2013
Berjanskii MV; Wishart DS
Journal of the American Chemical Society
Vol. 135
(39)
, pp. 14536-9
Protein side-chain motions are involved in many important biological processes including enzymatic catalysis, allosteric regulation, and the mediation of protein-protein, protein-DNA, protein-RNA, and protein-cofactor interactions. NMR spectroscopy has long been used to provide insights into the motions of side-chain groups. Currently, the method of choice for studying side-chain dynamics by NMR is the measurement of methyl (2)H autorelaxation. Methyl (2)H autorelaxation exhibits simple relaxation mechanisms and can be straightforwardly converted to meaningful dynamic parameters. However, methyl groups can only be found in 6 of 19 side-chain bearing amino acids. Consequently, only a sparse assessment of protein side-chain dynamics is possible. Therefore, there is a significant interest in developing novel methods of studying side-chain motions that can be applied to all types of side-chains. Here, we show how side-chain chemical shifts can be used to determine the magnitude of fast side-chain motions in proteins. The chemical shift method is applicable to all side-chain bearing residues and does not require any additional measurements beyond standard NMR experiments for backbone and side-chain assignments.
Comparative analysis of prion proteins for evolutionarily diverse vertebrate species, polymorphic variants and mutants: Structure and essential dynamics
2012
Stepanova M; Issack B; Santo K; Fito T; Berjanskii M; Wishart D
Structural domains and main-chain flexibility in prion proteins.
2009
Blinov N; Berjanskii M; Wishart DS; Stepanova M
Biochemistry
Vol. 48
(7)
, pp. 1488-97
In this study we describe a novel approach to define structural domains and to characterize the local flexibility in both human and chicken prion proteins. The approach we use is based on a comprehensive theory of collective dynamics in proteins that was recently developed. This method determines the essential collective coordinates, which can be found from molecular dynamics trajectories via principal component analysis. Under this particular framework, we are able to identify the domains where atoms move coherently while at the same time determining the local main-chain flexibility for each residue. We have verified this approach by comparing our results for the predicted dynamic domain systems with the computed main-chain flexibility profiles and the NMR-derived random coil indexes for human and chicken prion proteins. The three sets of data show excellent agreement. Additionally, we demonstrate that the dynamic domains calculated in this fashion provide a highly sensitive measure of protein collective structure and dynamics. Furthermore, such an analysis is capable of revealing structural and dynamic properties of proteins that are inaccessible to the conventional assessment of secondary structure. Using the collective dynamic simulation approach described here along with high-temperature simulations of the unfolding of human prion protein, we have explored whether locations of relatively low stability could be identified where the unfolding process could potentially be facilitated. According to our analysis, the locations of relatively low stability may be associated with the beta-sheet formed by strands S1 and S2 and the adjacent loops, whereas helix HC appears to be a relatively stable part of the protein. We suggest that this kind of structural analysis may provide a useful background for a more quantitative assessment of potential routes of spontaneous misfolding in prion proteins.
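The abstract mentions finding collective coordinates from molecular dynamics trajectories via principal component analysis. The sketch below shows only that generic PCA step on a toy trajectory, using power iteration to extract the leading mode; it is not the authors' theory of collective dynamics, and the data, function name, and iteration count are assumptions made for illustration.

```python
import math


def leading_mode(frames, iterations=200):
    """Return (variance, direction) of the dominant principal component.

    frames: list of snapshots, each a flat list of coordinates.
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in frames]
    # Sample covariance matrix of the coordinates.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return variance, v


# Toy "trajectory": coordinates 0 and 1 move together, coordinate 2 barely moves.
trajectory = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 0.1],
    [2.0, 2.0, -0.1],
    [3.0, 3.0, 0.0],
]

variance, mode = leading_mode(trajectory)
print(round(variance, 3), [round(x, 3) for x in mode])
```

In this toy case the leading mode has nearly equal weight on the first two coordinates and almost none on the third, which is the sense in which PCA picks out coordinates that move coherently.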
A simple method to predict protein flexibility using secondary chemical shifts.
2005
Berjanskii MV; Wishart DS
Journal of the American Chemical Society
Vol. 127
(43)
, pp. 14970-1
Protein motions play a critical role in many biological processes, such as enzyme catalysis, allosteric regulation, antigen-antibody interactions, and protein-DNA binding. NMR spectroscopy occupies a unique place among methods for investigating protein dynamics due to its ability to provide site-specific information about protein motions over a large range of time scales. However, most NMR methods require a detailed knowledge of the 3D structure and/or the collection of additional experimental data (NOEs, T1, T2, etc.) to accurately measure protein dynamics. Here we present a simple method based on chemical shift data that allows accurate, quantitative, site-specific mapping of protein backbone mobility without the need of a three-dimensional structure or the collection and analysis of NMR relaxation data. Further, we show that this chemical shift method is able to quantitatively predict per-residue RMSD values (from both MD simulations and NMR structural ensembles) as well as model-free backbone order parameters.
NMR structure of the N-terminal J domain of murine polyomavirus T antigens. Implications for DnaJ-like domains and for mutations of T antigens.
2000
Berjanskii MV; Riley MI; Xie A; Semenchenko V; Folk WR; Van Doren SR
TIMP-1 contact sites and perturbations of stromelysin 1 mapped by NMR and a paramagnetic surface probe.
1998
Arumugam S; Hemme CL; Yoshida N; Suzuki K; Nagase H; Berjanskii M; Wu B; Van Doren SR
Biochemistry
Vol. 37
(27)
, pp. 9650-7
Surfaces of the 173 residue catalytic domain of human matrix metalloproteinase 3 (MMP-3(DeltaC)) affected by binding of the N-terminal, 126 residue inhibitory domain of human TIMP-1 (N-TIMP-1) have been investigated using an amide-directed, NMR-based approach. The interface was mapped by a novel method that compares amide proton line broadening by paramagnetic Gd-EDTA in the presence and absence of the binding partner. The results are consistent with the X-ray model of the complex of MMP-3(DeltaC) with TIMP-1 (Gomis-Ruth et al. (1997) Nature 389, 77-81). Residues Tyr155, Asn162, Val163, Leu164, His166, Ala167, Ala169, and Phe210 of MMP-3(DeltaC) are protected from broadening by the Gd-EDTA probe by binding to N-TIMP-1. N-TIMP-1-induced exposure of backbone amides of Asp238, Asn240, Gly241, and Ser244 of helix C of MMP-3(DeltaC) to Gd-EDTA confirms that the displacement of the N-terminus of MMP-3(DeltaC) occurs not only in the crystal but also in solution. These results validate comparative paramagnetic surface probing as a means of mapping protein-protein interfaces. Novel N-TIMP-1-dependent changes in hydrogen bonding near the active site of MMP-3(DeltaC) are reported. N-TIMP-1 binding causes the amide of Tyr223 of MMP-3(DeltaC) bound by N-TIMP-1 to exchange with water rapidly, implying a lack of the hydrogen bond observed in the crystal structure. The backbone amide proton of Asn162 becomes protected from rapid exchange upon forming a complex with N-TIMP-1 and could form a hydrogen bond to N-TIMP-1. N-TIMP-1 binding dramatically increases the rate of amide hydrogen exchange of Asp177 of the fifth beta strand of MMP-3(DeltaC), disrupting its otherwise stable hydrogen bond.