How does SVM RNAi increase success rate?
The problem is to find a few good siRNA targets. The solution is simple in principle: reject a larger fraction of the bad siRNA targets than of the good ones, so that good targets are enriched in the remaining pool. Suppose we have 10 good targets and 40 bad targets, a starting success rate of 10/50 = 20%. If we reject 80% of the good targets and 95% of the bad targets, only 2 good targets and 2 bad targets remain, a success rate of 2/4 = 50%. We have increased our chance of success 2.5-fold.
To achieve this, we have optimized our SVM to reject as many bad siRNA targets as possible, even though it may also reject many of the good targets. Enrichment in good targets is ensured because bad targets are 1.3-2.5 times more likely to be rejected than good targets. See Figure 1 for details.
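The enrichment arithmetic in this answer can be checked with a short Python sketch. The function name and its parameters are illustrative only and are not part of SVM RNAi; rejection rates are passed as whole percentages to keep the arithmetic simple.

```python
def enrich(good, bad, reject_good_pct, reject_bad_pct):
    """Return surviving good/bad counts and the success rate before and after filtering."""
    good_left = good * (100 - reject_good_pct) / 100
    bad_left = bad * (100 - reject_bad_pct) / 100
    rate_before = good / (good + bad)
    rate_after = good_left / (good_left + bad_left)
    return good_left, bad_left, rate_before, rate_after

# Numbers from the example: 10 good and 40 bad targets,
# rejecting 80% of the good and 95% of the bad.
good_left, bad_left, rate_before, rate_after = enrich(10, 40, 80, 95)
fold = rate_after / rate_before
print(f"{good_left:g} good and {bad_left:g} bad remain; "
      f"success rate {rate_before:.0%} -> {rate_after:.0%} ({fold:.1f}-fold)")
```

The remaining pool is enriched whenever the rejection rate for bad targets is higher than the rejection rate for good targets, whatever the absolute values.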
Does SVM RNAi use rules such as AARN17YTT?
This is frequently a source of confusion. SVM RNAi is a learning program. It starts with no rules at all. By learning from training data, it comes up with a decision function for testing whether a particular sequence is a good siRNA candidate. The decision function is usually complex and may or may not be interpreted as simple rules.
On the other hand, since the majority of the good siRNAs in the training data satisfy Tuschl’s rules (AARN17Y), a large fraction of the targets predicted by SVM RNAi also satisfy them. However, this does not mean that SVM RNAi uses Tuschl’s rules. Many sequences that satisfy Tuschl’s rules are rejected by SVM RNAi, so SVM RNAi evidently applies additional restrictions.
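For readers unfamiliar with the notation, a pattern such as AARN17Y can be read with the standard IUPAC nucleotide codes (R = A or G, Y = C or T, N = any base), with a trailing number taken as a repeat count. The sketch below assumes that reading. It only illustrates the notation, not how SVM RNAi works; the helper name is hypothetical and the test sequence is made up.

```python
import re

# IUPAC nucleotide codes needed for this notation (DNA alphabet assumed).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def pattern_to_regex(pattern):
    """Translate a rule such as 'AARN17Y' into a compiled regular expression."""
    parts = []
    # Each token is one letter followed by an optional repeat count.
    for code, count in re.findall(r"([A-Z])(\d*)", pattern):
        repeat = "{" + count + "}" if count else ""
        parts.append(IUPAC[code] + repeat)
    return re.compile("".join(parts))

rule = pattern_to_regex("AARN17Y")
candidate = "AAG" + "ACGTACGTACGTACGTA" + "C"  # made-up 21-base sequence
print(rule.pattern, bool(rule.fullmatch(candidate)))
```

Passing such a check says nothing about whether SVM RNAi will accept the sequence, since many sequences that satisfy the rule are still rejected.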
What is the success rate of SVM RNAi?
Before talking about success rate, we must define success.
We set a high standard for a good siRNA target: (1) the knockdown must be detected at the protein level; (2) the level of knockdown must be
reproducible and significant (good for publication).
On the training set, SVM RNAi achieved a 10-fold cross-validation accuracy of ~90%. On independent sets, the overall accuracy is 50-70% for SVM RNAi 2
and 60-80% for SVM RNAi 3.6.
The overall accuracy is the success rate in predicting both positives and negatives.
SVM RNAi 3.6 is not optimized for the best overall accuracy. It is optimized to
reduce the false-positive rate, the rate that matters most to researchers.
SVM RNAi 3.6 can reduce the false-positive rate by as much as 50%.
If I choose 100 targets using SVM RNAi 3.6, how many will be good targets?
Unfortunately the success rate can't be computed directly from the SVM model. The best estimate we can give is the fail rate:
SVM RNAi fail rate ~ 50% * (Tuschl's rule fail rate).
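As a rough illustration of how this estimate answers the question, the sketch below converts a rule-based fail rate into an expected number of good targets. The 80% fail rate for rule-based selection is a hypothetical input, not a measured value, and the 50% factor is the best-case reduction quoted for SVM RNAi 3.6. The function name is likewise illustrative.

```python
def expected_good_targets(n_selected, rule_fail_rate, reduction_factor=0.5):
    """Estimate good targets using: SVM fail rate ~ reduction_factor * rule fail rate."""
    svm_fail_rate = reduction_factor * rule_fail_rate
    return n_selected * (1 - svm_fail_rate)

# Hypothetical input: rule-based selection fails 80% of the time.
estimate = expected_good_targets(100, 0.80)
print(f"Expected good targets out of 100: about {estimate:.0f}")
```

The result depends entirely on the fail rate you observe with rule-based selection for your own genes, so substitute your own figure for the 80% used here.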
Have you trained and tested your model using experimental data that measured mRNA level by real-time RT-PCR?
No, we excluded all data that didn't show a knockdown at the protein level. A small decrease in mRNA level may not lead to a significant decrease in protein level, since the translation machinery may compensate for the decrease in mRNA. If an siRNA also suppresses translation, then the change in the protein level will be greater. Furthermore, antisense RNA may also cause a decrease in the mRNA level as measured by real-time RT-PCR (Vickers et al., J Biol Chem. 278(9):7108-18).
What do you mean when you say SVM RNAi 3.6 can reduce the false-positive rate by as much as 50%?
If you have designed five siRNAs using the rules most people
use and only one showed specific siRNA
activity, your false-positive rate is 4/5 = 80%. Reducing the
false-positive rate by 50% means that the new false-positive rate will
be 80% * 50% = 40%, i.e., only 2 (5 * 40%) of five targets selected
by SVM RNAi 3.6 will lack specific activity.
If I reject all siRNA candidates, my false-positive rate will be 0.
SVM RNAi does not randomly reject siRNA targets. It is tested to
ensure that the probability of rejecting a good siRNA target is less than
the probability of rejecting a bad siRNA target. As a result, the remaining
pool of siRNAs is enriched in good siRNA targets. See Figure 1 for detailed
statistics.
Why do I need the stand-alone version?
The stand-alone SVM RNAi is version 3.6 while the online one is version 2.0.
Version 3.6 is trained and tested with a larger set of published siRNA sequences than version 2.0 (~300 versus ~100 siRNA sequences).
The stand-alone version allows users to eliminate siRNAs with multiple targets. BLAST may also be used for this step, but BLAST is time-consuming and its results may not be clear, especially when close family members exist. It is also easy to overlook hits in BLAST results.
Why do you eliminate siRNAs with only a 15-base match to more than one gene?
Although a single mismatch may abrogate silencing, recent findings suggest that shorter matches may translationally repress untargeted genes (Semizarov et al., PNAS 100(11):6347-6352; Doench et al., Genes Dev. 17(8):991-1008; Jackson et al., Nature Biotech. 21(6):635-637). Furthermore, a 15-base match to untargeted genes may decrease the effective siRNA concentration even when it causes no repression. For this reason, any siRNA duplex (in both directions) with the potential to bind to more than one gene is eliminated.
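The idea of a 15-base specificity check can be sketched in Python as follows. This is a naive exact-substring search written for illustration, not the algorithm used in the stand-alone program; the function names and the toy sequences are made up, and sequences are assumed to be uppercase DNA.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def genes_with_match(sirna, transcripts, k=15):
    """Names of transcripts sharing an exact k-base stretch with either strand."""
    strands = (sirna, reverse_complement(sirna))
    kmers = {s[i:i + k] for s in strands for i in range(len(s) - k + 1)}
    return sorted(name for name, seq in transcripts.items()
                  if any(kmer in seq for kmer in kmers))

def is_specific(sirna, transcripts, k=15):
    """True if at most one gene shares a k-base match with the duplex."""
    return len(genes_with_match(sirna, transcripts, k)) <= 1

# Toy data: geneA contains the full target, geneB only its first 15 bases.
sirna = "AAGCTGACCCTGAAGTTCATC"
transcripts = {
    "geneA": "GGG" + sirna + "CCC",
    "geneB": "TTT" + sirna[:15] + "AAA",
    "geneC": "ACGT" * 10,
}
hits = genes_with_match(sirna, transcripts)
print(hits, is_specific(sirna, transcripts))
```

In this toy case the candidate would be eliminated, because geneB shares a 15-base stretch with it even though geneB does not contain the full target site.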
What is the user experience?
User feedback so far has been generally positive. The performance of SVM RNAi 3.6 on data provided by users is
as good as or better than on published data.
Does SVM RNAi work for hairpin siRNA?
There is no strong reason to suspect hairpin siRNA will have a different set of rules for target selection. The SVM model is not trained separately for hairpin siRNA.
How do I get SVM RNAi 3.6?
1. Order a license from Chang Bioscience. We accept purchase orders (PO #) and credit cards. A permanent license costs $225.
2. If you prefer to download SVM RNAi 3.6 from the web, please click here. Otherwise the program will
be mailed to you on a CD. Note that the CD version contains genome data for 14 species. If you download the software, you must
download the genome data from NCBI separately. Click here for instructions.