HuProt™: THE HUMAN PROTEOME MICROARRAY
The world’s largest collection of full-length human proteins.
There have been other protein microarrays, but none were made from a protein library as comprehensive or thoroughly validated as HuProt™. Created by faculty at the High Throughput Biology (HIT) Center at the Johns Hopkins University School of Medicine, HuProt was the brainchild of CDI co-founders Jef Boeke, Heng Zhu, Dan Eichinger, & Seth Blackshaw. Original development on HuProt was funded by the NIH Common Fund Support for the Development of Protein Capture Reagents and Technologies, a major project which resulted in HuProt-validated monospecific monoclonal antibodies for human transcription factors. These antibodies, along with many others, are now sold by CDI as Monomabs™.
The collection starts with sequence-confirmed plasmids, which are used to make >21,000 GST-purified recombinant proteins in yeast. After purification, the GST-tagged proteins are piezoelectrically printed on glass slides in duplicate, along with control proteins (GST, BSA, histones, IgG, etc.). Slides are barcoded for tracking and archiving. Each microarray batch is routinely evaluated by anti-GST staining to demonstrate quality of expression and printing. Slides can be PATH™ nitrocellulose or SuperEpoxy2™. HuProt arrays present properly folded, three-dimensional human proteins and have been used to evaluate DNA and RNA binding, antibody specificity, small molecule binding, protein-protein interactions, and more.
Next-generation technology. More content, cleaner data.
We start with sequence-confirmed plasmids, then individually express and GST-purify proteins from S. cerevisiae. Piezoelectric printing is used to spot these in duplicate alongside controls in batches of up to 1000 arrays, with quality confirmed by anti-GST QA/QC. Successful folding is demonstrated by a kinase autophosphorylation assay. Service available.
Broad coverage of the human proteome.
The new HuProt v4.0 consists of >21,000 unique human proteins, isoform variants, and protein fragments – covering 16,794 unique genes. This includes 15,889 of the 19,613 canonical human proteins described in the Human Protein Atlas, with broad coverage across protein subclasses.
Content includes major functional classes such as intracellular proteins, membrane proteins, enzymes, secreted proteins, transcription factors, transporters, GPCRs, cytokines, immune receptors, immune checkpoints, CD markers, ion channels, cytosolic proteins, and nuclear receptors. Additionally, there is thorough coverage of proteins enriched in major tissues of interest such as testis, cerebral cortex, thyroid gland, skin, fallopian tube, liver, parathyroid, intestine, kidney, spleen, muscle, epididymis, lymph node, bone marrow, adrenal gland, esophagus, heart, appendix, tonsil, prostate, rectum, adipose tissue, stomach, colon, cervix, uterus, gallbladder, seminal vesicle, breast, ovary, endometrium, smooth muscle, salivary gland, pancreas, and bladder.
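As a quick check on the coverage figures quoted in this section, the short Python sketch below recomputes the share of canonical Human Protein Atlas proteins represented on HuProt v4.0. The two counts come from the text; the constant and function names are illustrative only.

```python
# Counts quoted in the text for HuProt v4.0.
CANONICAL_PROTEINS_ON_ARRAY = 15_889  # canonical proteins represented on the array
CANONICAL_PROTEINS_IN_HPA = 19_613    # canonical proteins described in the Human Protein Atlas


def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * covered / total


canonical_coverage = coverage_percent(
    CANONICAL_PROTEINS_ON_ARRAY, CANONICAL_PROTEINS_IN_HPA
)
print(f"Canonical protein coverage: {canonical_coverage:.1f}%")  # prints 81.0%
```

The result, just over 81%, is consistent with the ">81%" canonical coverage figure quoted elsewhere on this page.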
Reproducible protein distribution.
CDI’s non-contact piezoelectric ‘inkjet’ process uses next-generation Arrayjet printers and allows for rapid production of high quality microarray slides time after time. Compared with older contact pin printing methods, HuProt™ arrays are made with improved accuracy and reproducibility and with excellent spot morphology.
Reproducible serum profiling.
Reproducible Proteome-Wide IgG Autoantibody Immunoprofiling of a Healthy Human Male Within and Across HuProt Proteome Microarray Batches. Serum was collected from a healthy adult human male donor, incubated on pairs (Rep1, Rep2) of HuProt proteome microarrays across three print batches (Batch 1 Feb12_2020, Batch 2 Dec09_2019, Batch 3 Oct01_2019), and stained with anti-IgG (red) & anti-IgA (green) secondaries. Raw data were plotted on a log scale and linear regression analysis was performed. Intra-lot correlations of spot pair averages (red boxes) were R^2 > 0.95 within all three batches. Slide-to-slide correlations across all possible pairings of the six slides were R^2 > 0.90, demonstrating robust reproducibility of HuProt microarray data between individual slides. These results indicate that multi-isotype analyses requiring multiple slides should be reliable.
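The slide-to-slide comparison described above (log-scale signals, linear regression, R^2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CDI's analysis pipeline: the function name and the intensity values are hypothetical, and skipping non-positive signals is a choice made here only because the logarithm is undefined for them.

```python
import math


def log_r_squared(slide_a, slide_b):
    """R^2 of a simple linear regression between two slides on a log10 scale.

    With a single predictor, R^2 equals the squared Pearson correlation.
    Pairs containing a non-positive value are skipped (assumption of this
    sketch), because log10 is undefined for them.
    """
    pairs = [
        (math.log10(a), math.log10(b))
        for a, b in zip(slide_a, slide_b)
        if a > 0 and b > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two positive signal pairs")
    xs, ys = zip(*pairs)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("signals must vary to compute a correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical raw intensities for the same five spots on two replicate slides.
rep1 = [120.0, 950.0, 15000.0, 430.0, 60.0]
rep2 = [130.0, 900.0, 16000.0, 400.0, 65.0]
r2 = log_r_squared(rep1, rep2)
print(f"R^2 = {r2:.3f}")
```

In practice the inputs would be the duplicate-spot averages for every protein on each slide, with the function applied to each pairing of slides.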
Learn more about our ANTYGEN™ HuProt™ analysis services.
Deep dive on HuProt™ proteome microarray technology and QA/QC.
HuProt™ Microarray Production. The HuProt™ Human Proteome Microarray is the most comprehensive human proteome array created to date (Jeong et al., 2012). It contains over 21,000 human proteins and protein isoforms, including >81% of canonically expressed proteins as defined by the Human Protein Atlas, and allows hundreds of interactions to be profiled in high throughput. HuProt™ can be used for a wide range of applications, including mapping antigen-specific immunity as multi-isotype profiles in serum, determining monoclonal antibody specificity, and studying protein-protein interactions, substrate identification, protein-DNA binding, protein-RNA binding, and binding of some small molecules. CDI Labs’ latest version of the array, HuProt™ v4.0, contains >81% of human proteins in each major functional Gene Ontology protein category (Venkataraman et al., 2018).
Creation of HuProt™ Library. HuProt™ library clones were derived from public ORF libraries or independently synthesized; entry clones are from the laboratories of Heng Zhu and Seth Blackshaw (The Johns Hopkins University). Using the Gateway recombinant cloning system (Invitrogen, CA), human ORFs were shuttled from the entry clones to a yeast high-copy expression vector (pEGH-A) that produces GST-His6 fusion proteins under the control of the galactose-inducible GAL1 promoter. Plasmids were rescued into E. coli and verified by restriction endonuclease digestion. Plasmids with inserts of the correct size were transformed into yeast for protein purification (Hu S et al., 2009; Jeong J et al., 2012).
Validating and Curating Clones used in HuProt™. To check and confirm the identity of each human ORF in the HuProt™ library, bidirectional Sanger sequencing was conducted on both the entry clones and the yeast expression vectors that were derived from them (Venkataraman A et al., 2018). BLAST+ was used to align the ORF sequence to multiple public databases (UniProt, CCDS, RefSeq, and Ensembl) to generate an integrated alignment score for each clone. If a clone covered the entire sequence of a known protein, the clone was considered full length (F), whereas partial matches were regarded as indicative of truncated (TRUNC) clones. Because the source clones included ORFs containing untranslated regions, unannotated splice variants, and single-nucleotide polymorphisms, the clones were categorized into groups ranging from perfect matches to the known protein-coding transcriptome to potential protein-coding ORFs that have not yet been reviewed. A detailed breakdown of this classification, along with the threshold parameters, can be accessed at https://collection.cdi-lab.com/public.
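The full-length versus truncated distinction described above can be illustrated with a deliberately simplified Python sketch. It is not the production classifier: the real pipeline uses an integrated alignment score across several databases with threshold parameters published at the link above, whereas the function name, its inputs, and the plain coverage rule below are hypothetical.

```python
from typing import Optional


def classify_clone(aligned_residues: int, reference_length: int) -> Optional[str]:
    """Label a clone by how much of a known protein its best alignment covers.

    Both arguments are hypothetical inputs taken from an alignment report.
    Returns "F" for full length, "TRUNC" for a partial match, and None when
    the clone does not match the reference at all.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if aligned_residues <= 0:
        return None      # no match to the known protein
    if aligned_residues >= reference_length:
        return "F"       # covers the entire known protein
    return "TRUNC"       # covers only part of the known protein


# Hypothetical examples against a 350-residue reference protein.
print(classify_clone(350, 350))  # F
print(classify_clone(200, 350))  # TRUNC
```

A real classifier would also need to account for the untranslated regions, splice variants, and polymorphisms mentioned above, which this sketch ignores.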
Protein Purification from the HuProt™ Library. Proteins were purified from yeast transformed with expression vectors encoding the human ORFs. Human proteins were purified as GST-His6 fusion proteins using a previously described high-throughput purification protocol (Hu S et al., 2009; Zhu et al., 2001). In a 96-well format, the samples were purified from yeast extracts using glutathione-agarose beads. The lysis buffer and washes include 0.1% Triton to ensure that the purified proteins are free of lipids.
Protein Microarray Production & Testing. The purified human proteins were arrayed in a 384-well format and printed on PATH slides (GraceBio, USA) using an Arrayjet UltraMarathon printer (Arrayjet, UK) to create a block format. Arrays that show >95% of the spots with a foreground/background signal (F/B) ratio of at least 1.5 in an anti-GST assay are classified as usable. A number of controls that are reactive with secondary detection reagents are included on HuProt™. Controls include titrated GST protein, histones, mouse and rabbit anti-biotin, mouse IgM, and a biotin-tagged control for streptavidin detection. Each block also contains a row of control spots, including Alexa Fluor 555/647 as landmarks.
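The acceptance rule stated above (more than 95% of spots with an anti-GST F/B ratio of at least 1.5) can be expressed as a short Python check. This is a sketch under stated assumptions: the function names and example ratios are hypothetical, and the per-spot F/B ratios are assumed to have been computed already by the image-analysis software.

```python
def fraction_passing(fb_ratios, min_ratio=1.5):
    """Fraction of spots whose foreground/background ratio is at least min_ratio."""
    if not fb_ratios:
        raise ValueError("no spots supplied")
    passing = sum(1 for ratio in fb_ratios if ratio >= min_ratio)
    return passing / len(fb_ratios)


def array_is_usable(fb_ratios, min_ratio=1.5, min_fraction=0.95):
    """Apply the stated rule: more than 95% of spots with F/B of at least 1.5."""
    return fraction_passing(fb_ratios, min_ratio) > min_fraction


# Hypothetical anti-GST F/B ratios for two 100-spot examples.
good_array = [3.2] * 97 + [1.1] * 3   # 97% of spots pass
poor_array = [3.2] * 90 + [1.1] * 10  # 90% of spots pass
print(array_is_usable(good_array), array_is_usable(poor_array))  # True False
```

Keeping the thresholds as parameters makes it easy to see how sensitive the pass/fail call is to the chosen cut-offs.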
Tests show that HuProt™ arrays contain a majority of the annotated, full-length proteome in native conformation. Tests on HuProt™ show that the proteins are folded in native conformation and retain function (Hu S. poster; Venkataraman A. et al., 2018). When both native and denatured HuProt™ arrays were probed with monoclonal antibodies that selectively recognize either linear or folded epitopes of their cognate antigen, the antibodies were found to recognize the appropriate antigen form (Venkataraman A. et al., 2018). Further tests of RNA binding showed that proteins on HuProt™ retain function (Venkataraman A. et al., 2018).