IEDB Inclusion Criteria

From curation_manual
Revision as of 18:23, 11 June 2008 by Rvita (Talk | contribs)


Inclusion/Exclusion Criteria

All articles and epitopes must meet inclusion criteria in order to be included in the database.

Relevant Experimental Data

Experimental Data

The reference must report original and experimental epitope-related data.

  • Computer-derived predictions without functional experimental data will not be included in the database.
  • Sequence analysis of defined epitopes will not be included in the database unless novel information is provided (e.g., identification of anchor residues).
  • Reviews and meta-analyses will not be included in the database.
  • Results reported only as "data not shown" are not curated.
  • Personal communication is not sufficient for curation.
  • When figures or tables include data from previous publications, the curator should determine whether the previous publication has been curated. If it has not, its PMID should be sent to the Document Specialist so that it can be included in the database.

Scope and Exclusions

The experimental data must fall within the scope of the database.

Relevant data includes:

  • MHC binding data
  • Epitope elution from MHC (Naturally processed MHC ligands)
  • T cell responses to an epitope (including NK T cells)
  • B cell/antibody responses to an epitope

Certain categories of experimental data will be specifically excluded from the database.

Exclusion: NK Epitopes

Epitopes that are recognized by Natural Killer cells (non-T cell) will be excluded from the database. Please note: NK T cell epitopes/ligands will be included (as noted in the Scope and Exclusions section).

Exclusion: Non-Immunological Interpretation of "Epitope"

Experimental data describing "epitopes" in non-immunological contexts will not be included in the database. For example, the structures that are in contact in protein-protein interactions are sometimes referred to as an epitope. This interpretation of "epitope" will not be included in the database.

Exclusion: Epitope Tags

References discussing epitope tags utilized as a technical tool for immunoprecipitation, purification, and similar experiments will be excluded from the database.

Exclusion: Superantigen

References discussing superantigens in the context of peptide-MHC-superantigen complexes will be excluded from the database. However, B cell/antibody responses to superantigens (especially Staphylococcus enterotoxin B (SEB), a NIAID Category B priority pathogen) will be curated.

Important Note: If a superantigen is used as part of an assay to stabilize the interaction of an epitope with MHC, the epitope studied is curatable and should be captured.

Exclusion: TCR Antagonism

Data that the authors describe as TCR antagonism will not be curated. However, TCR competition that the authors do not specifically label as TCR antagonism will be curated. TCR antagonism used in an MHC binding inhibition assay will be captured as an MHC binding assay.

Exclusion: Antigen Processing

Antigen processing data generated to study the effects of variables on processing itself, rather than to study an epitope (i.e., the epitope is incidental to the experiment), will not be included in the database unless the identification of a novel epitope is demonstrated.

Exclusion: Adoptive Transfer

Assays involving adoptive transfer will not be curated at this time.

Epitopes Relevant to IEDB

Length/Mass Restrictions

Table 1. Length/Mass Restrictions
Epitope Class   Uncuratable                  Epitopic Region (# residues)   Epitope (# residues)
B-cell          > 5 kDa or 50 amino acids    12 - 50                        1 ≤ x ≤ 11
Class I         > 5 kDa or 50 amino acids    12 - 50                        7 ≤ x ≤ 11
Class II        > 5 kDa or 50 amino acids    16 - 50                        7 ≤ x ≤ 15

The database will only include epitopes of less than 50 residues, whether the sequence is linear or conformational. If the epitope is non-peptidic, its mass must be less than 5000 Daltons for it to be included in the database.
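The length bands in Table 1 can be read as a small lookup. The sketch below is illustrative only: the function names and return labels are hypothetical, it follows the table (which treats anything over 50 residues as uncuratable), and it does not model the pathogen-specific exception for longer regions. Lengths below the minimum for a class are not addressed by the table, so the sketch reports them separately.

```python
# Epitope length bands (inclusive) transcribed from Table 1.
# Names and labels are illustrative assumptions, not IEDB identifiers.
EPITOPE_BANDS = {
    "B-cell": (1, 11),
    "Class I": (7, 11),
    "Class II": (7, 15),
}
MAX_RESIDUES = 50    # longer peptides are listed as uncuratable
MAX_MASS_DA = 5000   # limit for non-peptidic epitopes


def classify_by_length(epitope_class, n_residues):
    """Map a peptide length to the Table 1 column it falls into."""
    low, high = EPITOPE_BANDS[epitope_class]
    if n_residues > MAX_RESIDUES:
        return "uncuratable"
    if n_residues > high:
        return "epitopic region"
    if n_residues >= low:
        return "epitope"
    return "not covered by Table 1"


def non_peptidic_within_mass_limit(mass_da):
    """Non-peptidic epitopes must be less than 5000 Daltons."""
    return mass_da < MAX_MASS_DA


print(classify_by_length("Class I", 9))    # epitope
print(classify_by_length("Class II", 16))  # epitopic region
print(classify_by_length("B-cell", 51))    # uncuratable
```

Keeping the bands in one dictionary means a change to the policy only needs to be made in one place.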

Important Exception: Epitopes greater than 50 residues will be curated for certain pathogens including Botulinum toxin and anthrax epitopes.

A region or fragment of >50aa from B. anthracis and C. botulinum will be curated as an epitopic region in the following cases:

Figure 1. Curation of Botulinum toxin and Anthrax epitopes.

Well-Characterized Epitopes

To broaden the spectrum of information in the database, we currently exclude the repeated curation of epitopes once 10 key references have been included in the database. The original articles describing the epitope, MHC restriction data, antibody responses, and articles containing novel information regarding the epitope will be included in the approximately 10 references. A compiled list of "well-characterized" epitopes is listed in Table 2 and can also be found in the Curation Network folder (\\Curation\CurationNotes\blacklisted.xls).

Table 2. Well-characterized epitopes
# Common Name Sequence Positions Source Species Source Protein Name Restriction Allele
1 TT Universal Helper epitope QYIKANSKFIGITE 830-843 Clostridium tetani Tetanus toxin DRB1*1302
2 OVA 257-264 SIINFEKL 257-264 Gallus gallus (Chicken) Ovalbumin Kb
3 Ova 323-339 ISQAVHAAHAEINEAGR 323-339 Gallus gallus (Chicken) Ovalbumin H2-Ag7, RT1.B1
4 HEL 46-61 NTDGSTDYGILQINSR 46-61 Gallus gallus (Chicken) Hen Egg white Lysozyme H-2 IAk
5 HBV core 18-27 FLPSDFFPSV 18-27 Hepatitis B Core Protein A2
6 SL9 SLYNTVATL 77-85 HIV Gag-p17 A*0201
7 TAX 11-19 LLFGYPVYV 11-19 HTLV Transcriptional activator (tax) A*0201
8 SYFPEITHI SYFPEITHI 367-375 Human Tyrosine kinase JAK1 Kd
9 MART-1(27-35) AAGIGILTV 27-35 Human MART-1 (Tumor antigen) A*0201
10 NY-ESO-1 epitope SLLMWITQC 157-165 Human Cancer/testis antigen NY-ESO-1 A*0201
11 Tyrosinase 370D YMDGTMSQV 368-376 Human Tyrosinase (Tumor antigen) A*0201
12 gp100 210M IMDQVPFSV 209-217 Human Glycoprotein (melanocyte lineage-specific antigen) A2
13 HER-2/neu 689-697 RLLQETELV 689-697 Human HER-2/neu (Tumor antigen) A*0201
14 HER-2/neu 369-377 KIFGSLAFL 369-377 Human HER-2/neu (Tumor antigen) A*0201
15 PLP 136-151 HSLGKWLGHPDKF 136-151 Human Myelin proteolipid protein H-2 IAs
16 MBP Ac1-11 Ac-ASQKRPSQRSK 1-11 Human Myelin Basic protein H-2 IAu
17 CMV pp65 NLVPMVATV 495-503 Human cytomegalovirus Phosphoprotein 65 (pp65) A*0201
18 M1 GILGFVFTL 58-66 Influenza Matrix Protein A*0201
19 Flu HA 307-319 PKYVKQNTLKLAT 307-319 Influenza Haemagglutinin DRB1*0101, DRB1*0401 (DR4Dw4), DRB1*0701, DRB1*1101, DRB5*0101(DR2a)
20 PA 224-233 SSLENFRAYV 224-233 Influenza Acid Polymerase H-2 Db
21 NP 366-374 ASNENMETM 366-374 Influenza Nucleoprotein H-2 Db
22 NP 147-155 TYQRTRALV 147-155 Influenza Influenza Nucleoprotein H-2 Kd
23 HA 110-120 SFERFEIFPKE 110-120 Influenza Haemagglutinin H-2 IEd
24 LCMV gp33 KAVYNFATC 33-41 LCMV Glycoprotein Kb, Db
25 LCMV gp 276 SGVENPGGYCL 276-286 LCMV Glycoprotein Db
26 LCMV np 396 FQPQNGQFI 396-404 LCMV Nucleoprotein Db
27 LCMV np 118-126 RPQASGYM 118-126 LCMV Nucleoprotein H-2 Ld
28 LLO 91-99 GYKDGNEYI 91-99 Listeria monocytogenes Listeriolysin O H-2 Kd
29 LLO 215-226 SQLIAKFGTAFK 215-226 Listeria monocytogenes Listeriolysin O H-2 IEk
30 LLO 190-201 NEKYAQAYPNVS 190-201 Listeria monocytogenes Listeriolysin O H-2 IAb
31 P60 449-457 IYVGNGQMI 449-457 Listeria monocytogenes p60 H-2 Kd
32 P60 217-225 KYGVSVQDI 217-225 Listeria monocytogenes p60 H-2 Kd
33 f-MIGWII MIGWII 1-6 Listeria monocytogenes LemA H2-M3
34 NANP NANP Plasmodium falciparum (malaria) Circumsporozoite protein
35 MCC 88-103 SYIPSAEKI 252-260 Plasmodium berghei Circumsporozoite protein H-2 Kd
36 MCC 88-103 ANERADLIAYLKQATK 88-103 Macrobrachium malcolmsonii (moth) Cytochrome c H-2 IEk
37 Hsp 234-252 LREAAEKAKIELSSSQSTS 234-252 Mycobacterium tuberculosis Heat shock protein (HSP) 70 RT1.B
38 PCC 88-104 KAERADLIAYLKQATAK 88-104 Pigeon Cytochrome c H-2 IEk

The above list, plus any well-characterized epitopes from additional sources encountered as the range of curated subjects expands, is to be curated following the exceptions noted below. We avoid curating papers that focus on strategies to enhance the immunogenicity of well-characterized epitopes. These papers use well-characterized peptides to evaluate variables such as alternative immunogen constructs (MAPS/vectors) or adjuvants.

Important Note: When a well-characterized epitope is presented in the context of an epitopic region or domain, the longer peptide will be considered a well-characterized epitope.
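As a rough illustration of the note above, a longer peptide can be checked for any embedded well-characterized sequence with a substring test. This is a minimal sketch: the dictionary holds only three entries copied from Table 2, and the function name and the longer example peptide are hypothetical.

```python
# A small subset of Table 2 (sequence -> common name); the full list has 38 entries.
WELL_CHARACTERIZED = {
    "SIINFEKL": "OVA 257-264",
    "GILGFVFTL": "M1",
    "NLVPMVATV": "CMV pp65",
}


def match_well_characterized(sequence):
    """Return the names of listed epitopes found within the given sequence."""
    seq = sequence.upper()
    return sorted(
        name for known, name in WELL_CHARACTERIZED.items() if known in seq
    )


print(match_well_characterized("SIINFEKL"))        # exact match
print(match_well_characterized("QLESIINFEKLTEW"))  # longer peptide containing it
print(match_well_characterized("AAAAAAAAA"))       # no match
```

A non-empty result only flags the peptide for review; the exceptions listed in this section still decide whether the reference is curated.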

Important Exceptions:

  • When well-characterized epitopes are studied in a novel host in terms of the MHC, TCR, antibody, or otherwise, the reference will be included in the database.
  • When well-characterized epitopes are used as controls they will not be curated. However, when the well-characterized epitope is used as a reference for a novel epitope from the same source protein, the well-characterized epitope will also be captured.
  • When a reference describes a well-characterized epitope in addition to other curatable epitopes, all of the epitopes from the reference will be captured.
  • At the discretion of the curators/EC, papers describing new or additional original data relating to these epitopes can and will be curated, for example, if the paper makes a novel discovery about the epitope.


References describing only epitopes derived from HIV/SIV will not be included in the database. However, when a manuscript describes epitopes from other relevant sources as well as HIV/SIV, all of the epitopes in the manuscript will be captured.

Curation Prioritization

Epitope curation should be conducted in the following priority order:

A) NIAID Category A, B, and C priority pathogens and toxins:

The complete list of NIAID Category A, B, and C priority pathogens and toxins can be found at the following URL:

NIAID – Category A

  • Bacillus anthracis (anthrax)
  • Clostridium botulinum
  • Yersinia pestis
  • Variola major (smallpox) and other pox viruses
  • Francisella tularensis (tularemia)
  • Viral hemorrhagic fevers
      • LCM, Junin virus, Machupo virus, Guanarito virus
      • Lassa Fever
      • Rift Valley Fever

NIAID – Category B

  • Burkholderia pseudomallei
  • Coxiella burnetii (Q fever)
  • Brucella species (brucellosis)
  • Burkholderia mallei (glanders)
  • Ricin toxin (from Ricinus communis)
  • Epsilon toxin of Clostridium perfringens
  • Staphylococcus enterotoxin B
  • Typhus fever (Rickettsia prowazekii)
  • Food and Waterborne Pathogens
      • Diarrheagenic E. coli
      • Pathogenic Vibrios
      • Shigella species
      • Listeria monocytogenes
      • Campylobacter jejuni
      • Yersinia enterocolitica
      • Viruses (Caliciviruses, Hepatitis A)
      • Cryptosporidium parvum
      • Cyclospora cayetanensis
      • Giardia lamblia
      • Entamoeba histolytica
  • Additional viral encephalitides
      • West Nile Virus
      • California encephalitis
      • Japanese Encephalitis Virus
      • Kyasanur Forest Virus

NIAID – Category C

  • Tickborne hemorrhagic fever viruses
      • Crimean-Congo Hemorrhagic fever virus
  • Tickborne encephalitis viruses
  • Yellow fever
  • Multi-drug resistant TB
  • Other Rickettsias
  • Severe acute respiratory syndrome-associated coronavirus (SARS-CoV)

B) Emerging and Re-emerging pathogens:

Pathogens newly recognized in the past two decades

  • Australian bat lyssavirus
  • Babesia, atypical
  • Bartonella henselae
  • Encephalitozoon cuniculi
  • Encephalitozoon hellem
  • Enterocytozoon bieneusi
  • Helicobacter pylori
  • Hendra or equine morbilli virus
  • Hepatitis C
  • Hepatitis E
  • Human herpesvirus 8
  • Human herpesvirus 6
  • Lyme borreliosis
  • Parvovirus B19
  • Coronaviruses/Severe Acute Respiratory Syndrome (SARS)

Re-emerging Pathogens

  • Enterovirus 71
  • Prion diseases
  • Streptococcus, group A
  • Staphylococcus aureus
  • Coccidioides immitis

C) Transplant rejection antigens and other alloantigens, Allergens, and Self antigens involved in autoimmunity

D) Infectious diseases not listed above under A) and B)

E) Epitopes associated with cancer
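The priority order A) through E) can be applied to a work queue with a simple sort. The sketch below assumes each reference has already been assigned one of the five letters; the record layout and function name are hypothetical and the PMIDs are placeholders.

```python
# Priority letters in the order given above (A is curated first).
PRIORITY_ORDER = ["A", "B", "C", "D", "E"]


def sort_by_priority(references):
    """Order references by priority letter; ties keep their original order."""
    return sorted(references, key=lambda ref: PRIORITY_ORDER.index(ref["priority"]))


queue = [
    {"pmid": "placeholder-1", "priority": "E"},  # cancer epitope
    {"pmid": "placeholder-2", "priority": "A"},  # NIAID Category A-C pathogen
    {"pmid": "placeholder-3", "priority": "C"},  # allergen or autoantigen
]
for ref in sort_by_priority(queue):
    print(ref["priority"], ref["pmid"])
```

Because Python's sort is stable, references within the same priority group stay in the order in which they were queued.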

Minimal Data Requirements

The reference must contain information for all required fields for at least one epitope in order to be included in the database. There are five major categories containing twenty-six fields. One set of data from each category must be available. These fields are highlighted in yellow in the data dictionary available in the Curation Network folder: (\\Curation\Docs\DataDictionary\). The required fields are listed in Table 3.

Table 3 Required Fields
# Section Classification Field Name Comments
1 a   Reference - Journal Article PubMed ID At least one set of fields from Category # 1 (1a, 1b or 1c) has to be filled out.
  b i Reference - Submission Author(s)
    ii Reference - Submission Affiliation(s)
  c   Reference - Patents Patent Publication Number
2 a   Epitope Structure Linear Sequences At least one of the three fields from Category # 2 (2a,2b or 2c) has to be filled out.
  b   Epitope Structure SMILES Structure
  c   Epitope Structure Conformational Sequence
3 a   Epitope Structure Epitopic Region / Domain Mandatory field. This Boolean field indicates whether the epitope that is captured is a minimal epitope or contained within a region / domain.
4 a   Epitope Source Epitope Source Nature At least one of the seven fields from Category # 4 (4a, 4b, 4c, 4d, 4e, 4f or 4g) has to be filled out. If the value of natural Antigen, which is a Boolean field, is ’no’, all other Epitope-Source fields are ignored.
  b   Epitope Source Source Species
  c   Epitope Source Gene Name
  d   Epitope Source Protein Name
  e   Epitope Source GenBank ID
  f   Epitope Source Swiss Prot ID
  g   Epitope Source PDB ID
5 a i MHC Binding MHC Allele At least one set of fields from Category # 5 (5a, 5b, 5c, 5d) has to be filled out. All the fields in a subsection have to be filled out if that subsection is selected. For example, if 5a is chosen, all three fields (5a-i,ii,iii) have to be filled out. The Assay Type and Qualitative Measurement fields can be entered as "Unknown" if the data are unavailable. It is anticipated that most data imported from other existing databases will not have the assay-related fields.
    ii MHC Binding Assay Type
    iii MHC Binding Qualitative Measurement
  b i MHC Ligand Elution MHC Allele
    ii MHC Ligand Elution Assay Type
    iii MHC Ligand Elution Qualitative Measurement
  c i T Cell Response - Assay MHC Allele
    ii T Cell Response - Assay Assay Type
    iii T Cell Response - Assay Qualitative Measurement
  d i B Cell Response - Assay Assay Type
    ii B Cell Response - Assay Qualitative Measurement
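The "at least one set per category" rule in Table 3 can be expressed as a short validation routine. This is a minimal sketch, not the database's actual schema: every field key below is a hypothetical name chosen to mirror the table, and the rule that other Epitope Source fields are ignored for non-natural antigens is not modeled.

```python
# Hypothetical field names; the grouping follows the five categories of Table 3.
ASSAY_SETS = {
    "mhc_binding": ("mhc_allele", "assay_type", "qualitative_measurement"),
    "mhc_ligand_elution": ("mhc_allele", "assay_type", "qualitative_measurement"),
    "t_cell_response": ("mhc_allele", "assay_type", "qualitative_measurement"),
    "b_cell_response": ("assay_type", "qualitative_measurement"),
}
STRUCTURE_FIELDS = ("linear_sequence", "smiles_structure", "conformational_sequence")
SOURCE_FIELDS = ("source_nature", "source_species", "gene_name", "protein_name",
                 "genbank_id", "swiss_prot_id", "pdb_id")


def has_value(fields, key):
    """True when the key is present and neither None nor an empty string."""
    value = fields.get(key)
    return value is not None and value != ""


def missing_categories(record):
    """Return the Table 3 category numbers (1-5) that the record fails to satisfy."""
    missing = []
    reference_ok = (
        has_value(record, "pubmed_id")
        or (has_value(record, "submission_authors")
            and has_value(record, "submission_affiliations"))
        or has_value(record, "patent_publication_number")
    )
    if not reference_ok:
        missing.append(1)
    if not any(has_value(record, f) for f in STRUCTURE_FIELDS):
        missing.append(2)
    if not has_value(record, "epitopic_region"):  # Boolean; False still counts as filled
        missing.append(3)
    if not any(has_value(record, f) for f in SOURCE_FIELDS):
        missing.append(4)
    # A subsection only counts when all of its fields are filled ("Unknown" is allowed).
    assay_ok = any(
        all(has_value(record.get(section) or {}, f) for f in fields)
        for section, fields in ASSAY_SETS.items()
    )
    if not assay_ok:
        missing.append(5)
    return missing


record = {
    "pubmed_id": "12345678",
    "linear_sequence": "SIINFEKL",
    "epitopic_region": False,
    "source_species": "Gallus gallus",
    "t_cell_response": {
        "mhc_allele": "Kb",
        "assay_type": "Unknown",
        "qualitative_measurement": "Positive",
    },
}
print(missing_categories(record))  # []
```

Returning the list of unmet categories, rather than a single yes/no, lets a curator see which part of the reference is incomplete.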

Epitope Structure Availability

In the event the exact epitope structure is not given in the reference, follow these guidelines:

I) Contact the corresponding author(s) using the template contact letter provided in the Curation Folder based upon information given in the manuscript. In the internal Curation Tracking System (CTS):

A) Fill out F2: Status should be "Waiting for author’s response". Enter author’s e-mail address and date of e-mail in Comments section of F2 form.
B) Once author provides structure(s): Complete the curation of the article, including "Provided by author" in the Data Location Field in Epitope Structure.
C) Update F2, adding comments to include by whom and when the information was sent.

II) If an e-mail address is not provided in the article, a reference cited by the manuscript regarding the epitope structure may be used for sequence information. The [Epitope Structure] Data Location Field should state whether the information came from a cited reference or from author communication.

The following format should be used anytime literature is cited in the IEDB: Sette et al. (2007). Nature 48: 1141-7. [PMID:12345678]

When citing epitope location, “Reference cited” should be used in location field and citation information should be placed in comments. Provide the PMID whenever it can be easily obtained.
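The citation format above can be produced consistently with a small helper. This is an illustrative sketch; the function and parameter names are assumptions, and the PMID is optional because it is only required when it can be easily obtained.

```python
def format_citation(first_author, year, journal, volume, pages, pmid=None):
    """Build a citation string in the 'Author et al. (Year). Journal Vol: Pages.' style."""
    citation = f"{first_author} et al. ({year}). {journal} {volume}: {pages}."
    if pmid is not None:
        citation += f" [PMID:{pmid}]"
    return citation


# Reproduces the example given in the text.
print(format_citation("Sette", 2007, "Nature", 48, "1141-7", pmid=12345678))
# Sette et al. (2007). Nature 48: 1141-7. [PMID:12345678]
```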

III) If doubts persist regarding the epitope structure, the data cannot be included in the database. If the author does not respond within two weeks, the reference will be deemed uncuratable. Curation status in Form F2 should be selected as <Uncuratable: Reference Scan> with comments including the reason(s) why the epitope/reference was deemed uncuratable.
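The two-week response window can be checked against the e-mail date recorded in the F2 comments. The sketch below is an assumption about how that might be done: the function name is hypothetical, and it treats a reply on the fourteenth day as still within the window.

```python
from datetime import date, timedelta

RESPONSE_WINDOW = timedelta(weeks=2)


def response_overdue(email_sent, today):
    """True once more than two weeks have passed since the author was contacted."""
    return today > email_sent + RESPONSE_WINDOW


print(response_overdue(date(2008, 6, 1), date(2008, 6, 15)))  # False
print(response_overdue(date(2008, 6, 1), date(2008, 6, 16)))  # True
```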

Important Note (Cited References): The same guidelines apply when researching context information such as the generation of T cell clones or monoclonal antibodies. When researching clone production and mAb generation, only go to references cited by the paper that you are curating. If those references do not provide the needed information, do not search further. Add an immunization comment stating that the details were not provided, and clarify this point on the cover sheet so the reviewer knows that you did look at the references but still could not find the needed information.