Deep dive on HuProt™ proteome microarray technology and QA/QC.
HuProt™ Microarray Production. The HuProt™ Human Proteome Microarray is the most comprehensive human proteome array created to date (Jeong et al, 2012). It contains over 21,000 human proteins and protein isoforms, including >81% of canonically expressed proteins as defined by the Human Protein Atlas, and allows hundreds of interactions to be profiled in high throughput. HuProt™ can be used for a wide range of applications, including mapping antigen-specific immunity as multi-isotype profiles in serum, determining monoclonal antibody specificity, and studying protein-protein interactions, substrate identification, protein-DNA binding, protein-RNA binding, and the binding of some small molecules. CDI Labs’ latest version of the array, HuProt™ v4.0, contains >81% of human proteins in each major functional Gene Ontology protein category (Venkataraman et al, 2018).
Creation of HuProt™ Library. The HuProt™ library clones are derived from four sources: the Ultimate Human ORF Collection (Invitrogen, CA); the NIH ORFeome effort (synthesized by GeneCopoeia, Inc.; Rual JF et al, 2004); and entry clones from the laboratories of Heng Zhu and Seth Blackshaw (The Johns Hopkins University). Using the Gateway recombinant cloning system (Invitrogen, CA), human ORFs were shuttled from the entry clones to a yeast high-copy expression vector (pEGH-A) that produces GST-His6 fusion proteins under the control of the galactose-inducible GAL1 promoter. Plasmids were rescued into E. coli and verified by restriction endonuclease digestion. Plasmids with inserts of the correct size were transformed into yeast for protein purification (Hu S et al, 2009; Jeong J et al, 2012).
Validating and Curating Clones used in HuProt™. To confirm the identity of each human ORF in the HuProt™ library, bidirectional Sanger sequencing was conducted on both the entry clones and the yeast expression vectors derived from them (Venkataraman A et al, 2018). BLAST+ was used to align each ORF sequence to multiple public databases (UniProt, CCDS, RefSeq, and Ensembl) to generate an integrated alignment score for each clone. A clone that covered the entire sequence of a known protein was considered full length (F), whereas a partial match was regarded as indicating a truncated (TRUNC) clone. Because the source clones included ORFs containing untranslated regions, unannotated splice variants, and single-nucleotide polymorphisms, the clones were categorized into groups ranging from perfect matches to the known protein-coding transcriptome to potential protein-coding ORFs that have not yet been reviewed. A detailed breakdown of this classification, along with the threshold parameters, can be accessed at https://collection.cdi-lab.com/public.
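The full-length versus truncated distinction can be illustrated with a minimal Python sketch. It is not the curation pipeline itself: the `Alignment` record, its field names, and the rule "the alignment spans reference positions 1 through the reference length" are assumptions made for illustration. The actual threshold parameters, including any identity cutoffs, are the ones published at the URL above.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical summary of one clone-to-reference protein alignment."""
    ref_length: int  # length of the reference protein, in residues
    ref_start: int   # 1-based first reference position covered by the clone
    ref_end: int     # 1-based last reference position covered by the clone


def classify_clone(aln: Alignment) -> str:
    """Return "F" if the clone covers the whole reference, else "TRUNC".

    Assumption: coverage alone decides the label. Sequence identity,
    mismatches, and multi-database score integration are ignored here.
    """
    if aln.ref_length <= 0:
        raise ValueError("reference length must be positive")
    covers_all = aln.ref_start == 1 and aln.ref_end == aln.ref_length
    return "F" if covers_all else "TRUNC"


# Usage: a clone spanning residues 1-350 of a 350-residue protein is
# full length; one that starts at residue 40 is truncated.
full = classify_clone(Alignment(ref_length=350, ref_start=1, ref_end=350))
partial = classify_clone(Alignment(ref_length=350, ref_start=40, ref_end=350))
```

Comparing coordinates rather than a coverage percentage keeps the sketch free of any invented cutoff value.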
Protein Purification from the HuProt™ Library. Proteins were purified from yeast transformed with expression vectors encoding the human ORFs. The human proteins were purified as GST-His6 fusion proteins using a previously described high-throughput purification protocol (Hu S et al, 2009; Zhu et al, 2001). Working in a 96-well format, the proteins were purified from yeast extracts using glutathione-agarose beads. The lysis and wash buffers included 0.1% Triton to ensure that the purified proteins were free of lipids.
Protein Microarray Production & Testing. The purified human proteins were arrayed in a 384-well format and printed on PATH slides (GraceBio, USA) in a block format using an Arrayjet UltraMarathon printer (Arrayjet, UK). Arrays in which >95% of the spots show a foreground/background signal (F/B) ratio of at least 1.5 in an anti-GST assay are classified as usable. HuProt™ includes several controls that react with secondary detection reagents: titrated GST protein, histones, mouse and rabbit anti-biotin, mouse IgM, and a biotin-tagged control for streptavidin detection. Each block also contains a row of control spots, including Alexa Fluor 555/647 as landmarks.
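The acceptance criterion can be expressed as a short calculation. The Python sketch below is illustrative only: the function names and the plain lists of per-spot foreground and background intensities are assumptions, not part of any CDI Labs software. Only the 1.5 ratio and the 95% cutoff come from the text.

```python
from typing import Sequence


def fraction_passing(foreground: Sequence[float],
                     background: Sequence[float],
                     min_ratio: float = 1.5) -> float:
    """Fraction of spots whose F/B ratio is at least min_ratio."""
    if len(foreground) != len(background):
        raise ValueError("foreground and background must have equal length")
    if not foreground:
        raise ValueError("no spots supplied")
    # Compare fg >= min_ratio * bg instead of dividing, so a zero
    # background reading cannot cause a division error.
    passing = sum(1 for fg, bg in zip(foreground, background)
                  if fg >= min_ratio * bg)
    return passing / len(foreground)


def array_is_usable(foreground: Sequence[float],
                    background: Sequence[float],
                    min_ratio: float = 1.5,
                    min_fraction: float = 0.95) -> bool:
    """True if strictly more than min_fraction of spots pass the F/B test."""
    return fraction_passing(foreground, background, min_ratio) > min_fraction


# Usage with made-up intensities: 96 of 100 spots have F/B = 3.0,
# and the remaining 4 have F/B = 1.0.
fg = [300.0] * 96 + [100.0] * 4
bg = [100.0] * 100
usable = array_is_usable(fg, bg)
```

The strict `>` comparison mirrors the ">95%" wording, so an array on which exactly 95% of spots pass would not qualify under this reading.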
Tests show that HuProt™ arrays contain a majority of the annotated, full-length proteome in native conformation. The proteins on HuProt™ are folded in their native conformation and retain function (Hu S. poster; Venkataraman A., et al., 2018). When both native and denatured HuProt™ arrays were probed with monoclonal antibodies that selectively recognize either linear or folded epitopes of their cognate antigen, each antibody recognized the appropriate form of the antigen (Venkataraman A., et al., 2018). Further tests of RNA binding showed that proteins on HuProt™ retain function (Venkataraman A., et al., 2018).