Catalog Search Form Help

All search criteria are optional. Searching with multiple criteria will return loci that match ALL criteria. Searches without criteria will return all loci in the catalog. This may be useful for subsequent sequence retrievals.

For multi-select boxes, you may select or de-select multiple entries by holding down the <ctrl> key (Windows/Linux) or the <open-apple> key (Apple) while clicking with the mouse.

Name: Term(s) entered are searched against locus names as well as the accession and gene names of matching or related public records. To search for a locus by its LLNL id, enter the id number prefixed with "LLNL" (format: LLNL#, for example LLNL77).

You may enter multiple terms separated by any common delimiter (spaces, commas, etc.). Multiple terms are treated as an OR search, i.e. loci that match any of the terms will be included in your results. The '%' character may be used as a wildcard. The search is case insensitive.

Example queries:

  • "znf605" would return ZNF605
  • "llnl77" also returns ZNF605 (its LLNL id is 77)
  • "NM_183238" also returns ZNF605
  • "znf605 znf606" would return both ZNF605 and ZNF606.
  • "znf60%" would return ZNF600, ZNF605, ZNF606, and ZNF607
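
The matching rules above (delimiter-separated terms, OR semantics, '%' wildcard, case insensitivity) can be sketched in Python. This is only an illustration, not the catalog's actual implementation: the function names are hypothetical, '%' is assumed to match zero or more characters, each term is assumed to match a whole name, and matching against LLNL ids and public accessions is left out.

```python
import re

def split_terms(query):
    """Split a query on common delimiters (whitespace, commas, semicolons)."""
    return [t for t in re.split(r"[\s,;]+", query.strip()) if t]

def term_to_regex(term):
    """Turn a term with '%' wildcards into a case-insensitive regex.

    Assumption: '%' stands for zero or more characters.
    """
    parts = (re.escape(p) for p in term.split("%"))
    return re.compile(".*".join(parts), re.IGNORECASE)

def search_names(query, names):
    """Return the names matching ANY term (OR search), in catalog order."""
    patterns = [term_to_regex(t) for t in split_terms(query)]
    if not patterns:
        # No criteria entered: every locus is returned.
        return list(names)
    return [n for n in names if any(p.fullmatch(n) for p in patterns)]
```

With a toy list of names such as ZNF600, ZNF605, ZNF606, ZNF607 and ZNF610, search_names("znf60%", names) keeps the first four and drops ZNF610, and search_names("znf605 znf606", names) keeps exactly ZNF605 and ZNF606.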

Position: Enter genomic positions relative to the UCSC hg18 build. You may optionally restrict results to the + or - strand; commas in the coordinates are ignored.

Example queries:

  • "chr19" would return all loci on chromosome 19
  • "chr19:11455245-12601676" returns all loci on chromosome 19 found (completely or partially) between positions 11455245 and 12601676
  • "chr19:11455245-12601676(+)" returns all "+ strand" loci on chromosome 19 found (completely or partially) between positions 11455245 and 12601676
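
For readers who want to pre-check a position string, the query format shown above can be sketched as a small Python parser with an overlap test. The names (Region, parse_position, overlaps) are hypothetical and the form's real parsing may differ. Coordinates are treated as inclusive, and a locus counts as found when it lies completely or partially inside the range.

```python
import re
from typing import NamedTuple, Optional

class Region(NamedTuple):
    chrom: str
    start: Optional[int]   # None when only a chromosome was given
    end: Optional[int]
    strand: Optional[str]  # "+", "-", or None for either strand

# chrN, optionally ":start-end", optionally "(+)" or "(-)"
_POSITION = re.compile(r"^(chr\w+)(?::(\d+)-(\d+))?(?:\(([+-])\))?$")

def parse_position(query):
    """Parse a position query; commas and spaces are ignored."""
    cleaned = query.replace(",", "").replace(" ", "")
    m = _POSITION.match(cleaned)
    if m is None:
        raise ValueError(f"unrecognized position: {query!r}")
    chrom, start, end, strand = m.groups()
    if start is not None and int(start) > int(end):
        raise ValueError("start must not exceed end")
    return Region(chrom,
                  int(start) if start else None,
                  int(end) if end else None,
                  strand)

def overlaps(region, chrom, start, end, strand):
    """True if a locus lies completely or partially within the region."""
    if chrom != region.chrom:
        return False
    if region.strand is not None and strand != region.strand:
        return False
    if region.start is None:
        return True  # whole-chromosome query
    return start <= region.end and end >= region.start
```

For example, parse_position("chr19:11,455,245-12,601,676(+)") yields chromosome chr19, start 11455245, end 12601676 and strand "+".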

Type: Restrict search to locus type (assigned by LLNL annotators). Selecting multiple types will return loci that match ANY of the selected types (i.e. multiple terms are treated as an OR search). Assignments are made as follows:

  • Known: Previously known locus with existing RefSeq or UCSC Known Gene record(s)
  • Novel: Valid gene model with intact ORF and significant RNA evidence; not previously "Known"
  • Putative: Valid gene model with intact ORF but no significant RNA evidence; not previously "Known"
  • Pseudogene (complete): Full-length pseudogene
  • Pseudogene (fragment): Fragment of a gene
  • Pseudogene (complete / processed): Full-length pseudogene missing intron(s)
  • Pseudogene (fragment / processed): Fragment of a pseudogene missing intron(s)

The assignment of a locus to one of the first three categories is relative to external public databases, which are constantly being updated with new information. These assignments will therefore need to be revised as new information becomes available, and they are not intended to be viewed as immutable truths! We will make every effort to keep our data current and welcome any comments.

Example queries:

  • "Known,Novel,Putative" would return loci that are thought to be genes, i.e. pseudogenes would be excluded
  • Choosing all four of the "Pseudogene" options would return loci that are thought to be pseudogenes, i.e. genes would be excluded

Motifs: Restrict your search by effector motifs found within the genomic sequence. It is important to understand that the effector motifs were identified by analysis of the genomic sequence and so may be contained within the region of a given locus WITHOUT being completely transcribed / translated.

Steps to include effector motif-based criteria in your search:
1. Select one or more motifs that you are interested in (additional options will then appear).
2. Define the search behavior for the selected motifs. The choices are intended to be self-explanatory if you note that the meaning of 'contains' is determined in the next step.
3. Define the relationship of the motif to the model that you are interested in.
4. Check/uncheck the "Allow motifs to be partially contained" box. This determines whether a motif is still considered to be 'contained' within the model if only PART of its sequence overlaps the relevant portion of the model. If you do not check the box, the ENTIRE motif as identified in the genomic sequence MUST BE COMPLETELY contained within the region specified in step 3. In most cases you will want this box to be checked.
5. Define whether loci should still be returned if the Representative Model does not meet the criteria specified in steps 1 - 4 above. In other words, the main model does not meet your criteria but there is an alternate transcript that does.
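
The effect of the "partially contained" box and of the alternate-transcript option can be sketched in Python. This is a simplified illustration under stated assumptions, not the catalog's code: all names are hypothetical, the "relevant portion" of a model from step 3 is reduced to a single inclusive interval, each model is a dict with a "region" and a list of (name, start, end) motifs, and the first model in the list stands for the Representative Model. Only the "ONE OR MORE" search behavior is shown.

```python
def contained(motif, region, allow_partial):
    """Is a motif interval 'contained' in a region interval?

    Both are (start, end) pairs with inclusive coordinates.
    """
    m_start, m_end = motif
    r_start, r_end = region
    if allow_partial:
        # Box checked: any overlap counts.
        return m_start <= r_end and m_end >= r_start
    # Box unchecked: the entire motif must lie inside the region.
    return r_start <= m_start and m_end <= r_end

def model_matches(model, selected, allow_partial):
    """True if the model contains ONE OR MORE of the selected motifs."""
    return any(
        name in selected and contained((start, end), model["region"], allow_partial)
        for name, start, end in model["motifs"]
    )

def locus_matches(models, selected, allow_partial, any_model):
    """Test only the representative model (models[0]) unless any_model is set."""
    candidates = models if any_model else models[:1]
    return any(model_matches(m, selected, allow_partial) for m in candidates)
```

In this sketch, a motif spanning 90-150 against a region of 100-200 matches only when partial containment is allowed. If an alternate model's region is 80-200, the locus also matches with the box unchecked, provided alternate transcripts are included.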

An example:

  • A search with the following choices: "KRAB-A,SCAN"; "Contains ONE OR MORE of selected motif(s)"; "Translated (in frame)"; and "Allow motifs to be partially contained" unchecked, would return all loci that contain KRAB-A OR SCAN motifs completely translated in frame.

    Loci in which the motifs are completely translated in frame only in alternative transcripts would be included only if "Any model matches criteria (i.e. including alternate transcripts)" was also selected. Also, since "Allow motifs to be partially contained" was unchecked, loci containing models in which all but a few residues of one of the motifs were translated in frame would be excluded. For this reason it is often useful to leave this box checked.