<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: RNA Journal Club 6/25/09</title>
	<atom:link href="http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/feed/" rel="self" type="application/rss+xml" />
	<link>http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/</link>
	<description>Why I Like The RNAs</description>
	<lastBuildDate>Fri, 15 Apr 2011 04:16:37 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Chi</title>
		<link>http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/#comment-194</link>
		<dc:creator><![CDATA[Chi]]></dc:creator>
		<pubDate>Mon, 27 Jul 2009 15:50:31 +0000</pubDate>
		<guid isPermaLink="false">http://youdpreferanargonaute.com/?p=385#comment-194</guid>
		<description><![CDATA[Thank you very much for the comment on our paper. We are continuously updating our project website to provide all information about the Ago HITS-CLIP map. Please keep an eye on our website (http://ago.rockefeller.edu) for updated information.

Regarding the issue raised by Noah above, I’d like to make it clear that we did not “cherry pick” our data.  In the paper, we tried to clarify our motivations for the criteria we chose for biologic complexity (BC) and peak height, as follows:
“Relative to more stringent analyses (Fig. 2c), our analysis at this 
threshold (BC&gt;=2) was more sensitive and sufficiently specific such that we 
used it for subsequent analyses (Supplementary Fig. 7)&quot;

Part of the difficulty is the severe space constraint put upon us by the journal, such that we struggled to balance general descriptions of the work with sufficient detail necessary to make it rigorous (with a necessarily large amount of supplementary information).  What this sentence is meant to clarify is our general strategy:  to initially analyze the data using a stringent set of criteria, and then generalize it using a larger dataset.  Hence what appeared as “cherry picking” to you was rationally based.

We originally developed an empirical approach to BC and peak height based on the validation experiments.  In Figure 2, we set out to develop a new method ab initio.  We began with the most stringent and trustworthy datasets to determine the distribution of tags and cluster width relative to peak position, in order to define a conservative Ago footprint region. To locate the peak position at high resolution, which is interpolated by cubic spline, we need a sufficiently large number of tags in clusters with single peaks. As the tag length is 36nt (the maximum read length of Solexa sequencing is 36), peak heights greater than 30 are needed to define the peak position at single-nucleotide resolution (in the extreme case, positioning a 6mer seed-interacting site within 36nt tags requires 30 different unique tags at single-nucleotide resolution). So we used the clusters with BC5 and peak height &gt; 30 (Fig. 2A; we thought this was quite intuitive without detailed explanation. We should have put it in the supplementary information for readers who did not find it intuitive.)

The peak height cutoff (&gt;20) in Supplementary Fig. 7 is the cutoff for accuracy of Ago binding, based on comparing cluster numbers at different peak height thresholds for different BC values (the distributions for different BC values begin to merge around this threshold). So we used &gt;30, which is more stringent than 20 (and likely to be more real; that’s why we refer to Supplementary Fig. 7) and could be used to accurately define peak position at high resolution, which is essential for defining the Ago footprint region (Fig. 2A).

We then generalized from the more stringent dataset (BC5) to the more general dataset (BC2; Fig 2A-&gt;2B).  A similar pattern was followed in the analysis of conserved seed sequences in Ago footprints: our initial discovery strategy used stringent conditions (Fig. 2C:  BC5, threshold 30), and we then took a more general approach (2C-&gt;2D, where in Fig. 2D, BC &gt; 2, no restriction on peak height).  Given our promising results in Fig. 2B and 2D, we then chose to use these criteria (BC2, no peak height restriction) for the rest of the paper (we also used BC&gt;=2 for estimation of false positives. We now realize that it was mistakenly written as BC&gt;2; the equals sign disappeared somehow in editing.)

Regarding the analysis of top 30 seeds vs bottom 30 seeds, if you look at Supplementary Figure 13C, the false positive rate is quite good up to 20. That’s why we used the top 20 miRNA seeds to generate the Ago ternary map with high accuracy. But the purpose of Fig. 2C is to estimate how well the Ago footprint defines miRNA binding sites by comparing not only the number of seeds but also their distribution relative to peaks (top 30 seeds vs bottom 30 seeds). To compare the two distributions fairly (by kurtosis), seeds from the control set (bottom 30) should have a reasonable number of conserved seed sites. We found very few sites from the bottom 20, which is not enough to compare the distributions statistically. So we used Top30 vs Bottom30 for this comparison, although the result is not better than Top20 vs Bottom20 (Fig. 2C and Supplementary Figure 13C). For estimating false positive rates, we used Top20 vs Bottom20, which are our final criteria for the Ago ternary map, but we also explained the false positive rates that could be estimated from Figure 2C to give further information for the reader (also in Supplementary Table 3 and Supplementary Figure 13).

Bob &amp; Chi]]></description>
		<content:encoded><![CDATA[<p>Thank you very much for the comment on our paper. We are continuously updating our project website to provide all information about the Ago HITS-CLIP map. Please keep an eye on our website (<a href="http://ago.rockefeller.edu" rel="nofollow">http://ago.rockefeller.edu</a>) for updated information.</p>
<p>Regarding the issue raised by Noah above, I’d like to make it clear that we did not “cherry pick” our data.  In the paper, we tried to clarify our motivations for the criteria we chose for biologic complexity (BC) and peak height, as follows:<br />
“Relative to more stringent analyses (Fig. 2c), our analysis at this<br />
threshold (BC&gt;=2) was more sensitive and sufficiently specific such that we<br />
used it for subsequent analyses (Supplementary Fig. 7)&#8221;</p>
<p>Part of the difficulty is the severe space constraint put upon us by the journal, such that we struggled to balance general descriptions of the work with sufficient detail necessary to make it rigorous (with a necessarily large amount of supplementary information).  What this sentence is meant to clarify is our general strategy:  to initially analyze the data using a stringent set of criteria, and then generalize it using a larger dataset.  Hence what appeared as “cherry picking” to you was rationally based.</p>
<p>We originally developed an empirical approach to BC and peak height based on the validation experiments.  In Figure 2, we set out to develop a new method ab initio.  We began with the most stringent and trustworthy datasets to determine the distribution of tags and cluster width relative to peak position, in order to define a conservative Ago footprint region. To locate the peak position at high resolution, which is interpolated by cubic spline, we need a sufficiently large number of tags in clusters with single peaks. As the tag length is 36nt (the maximum read length of Solexa sequencing is 36), peak heights greater than 30 are needed to define the peak position at single-nucleotide resolution (in the extreme case, positioning a 6mer seed-interacting site within 36nt tags requires 30 different unique tags at single-nucleotide resolution). So we used the clusters with BC5 and peak height &gt; 30 (Fig. 2A; we thought this was quite intuitive without detailed explanation. We should have put it in the supplementary information for readers who did not find it intuitive.)</p>
<p>The peak height cutoff (&gt;20) in Supplementary Fig. 7 is the cutoff for accuracy of Ago binding, based on comparing cluster numbers at different peak height thresholds for different BC values (the distributions for different BC values begin to merge around this threshold). So we used &gt;30, which is more stringent than 20 (and likely to be more real; that’s why we refer to Supplementary Fig. 7) and could be used to accurately define peak position at high resolution, which is essential for defining the Ago footprint region (Fig. 2A).</p>
<p>We then generalized from the more stringent dataset (BC5) to the more general dataset (BC2; Fig 2A-&gt;2B).  A similar pattern was followed in the analysis of conserved seed sequences in Ago footprints: our initial discovery strategy used stringent conditions (Fig. 2C:  BC5, threshold 30), and we then took a more general approach (2C-&gt;2D, where in Fig. 2D, BC &gt; 2, no restriction on peak height).  Given our promising results in Fig. 2B and 2D, we then chose to use these criteria (BC2, no peak height restriction) for the rest of the paper (we also used BC&gt;=2 for estimation of false positives. We now realize that it was mistakenly written as BC&gt;2; the equals sign disappeared somehow in editing.)</p>
<p>Regarding the analysis of top 30 seeds vs bottom 30 seeds, if you look at Supplementary Figure 13C, the false positive rate is quite good up to 20. That’s why we used the top 20 miRNA seeds to generate the Ago ternary map with high accuracy. But the purpose of Fig. 2C is to estimate how well the Ago footprint defines miRNA binding sites by comparing not only the number of seeds but also their distribution relative to peaks (top 30 seeds vs bottom 30 seeds). To compare the two distributions fairly (by kurtosis), seeds from the control set (bottom 30) should have a reasonable number of conserved seed sites. We found very few sites from the bottom 20, which is not enough to compare the distributions statistically. So we used Top30 vs Bottom30 for this comparison, although the result is not better than Top20 vs Bottom20 (Fig. 2C and Supplementary Figure 13C). For estimating false positive rates, we used Top20 vs Bottom20, which are our final criteria for the Ago ternary map, but we also explained the false positive rates that could be estimated from Figure 2C to give further information for the reader (also in Supplementary Table 3 and Supplementary Figure 13).</p>
<p>Bob &amp; Chi</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Noah</title>
		<link>http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/#comment-178</link>
		<dc:creator><![CDATA[Noah]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 20:16:57 +0000</pubDate>
		<guid isPermaLink="false">http://youdpreferanargonaute.com/?p=385#comment-178</guid>
		<description><![CDATA[Thanks for your response, Dr Darnell. I&#039;m glad that the raw data are now available, and completely understand that getting such things together can take a little longer. I&#039;ll have our blog editor post an update linking to the data. I&#039;m sure my colleagues will be excited to have a look at the data for themselves.

My problem with cherry-picking data actually comes primarily from figures 1 &amp; 2. For example, figures 1h and 2c display clusters only found in all 5 replicates, whereas some subsequent analyses involve all clusters found in at least two experiments (eg Fig 5) or at least 3 experiments (eg estimate of false negative rate; this one may just be a typo). Or for the calculation of false positive rate, the top &amp; bottom 30 seeds were used for one analysis whereas the top &amp; bottom 20 seeds were used for another analysis. Or the analysis that claims that a peak height cutoff of at least 20 is good, yet figure 2 uses a different peak height cutoff of at least 30. These individually are minor inconsistencies, but come without good explanation and at the very least make it difficult to compare analyses within the paper.]]></description>
		<content:encoded><![CDATA[<p>Thanks for your response, Dr Darnell. I&#8217;m glad that the raw data are now available, and completely understand that getting such things together can take a little longer. I&#8217;ll have our blog editor post an update linking to the data. I&#8217;m sure my colleagues will be excited to have a look at the data for themselves.</p>
<p>My problem with cherry-picking data actually comes primarily from figures 1 &amp; 2. For example, figures 1h and 2c display clusters only found in all 5 replicates, whereas some subsequent analyses involve all clusters found in at least two experiments (eg Fig 5) or at least 3 experiments (eg estimate of false negative rate; this one may just be a typo). Or for the calculation of false positive rate, the top &amp; bottom 30 seeds were used for one analysis whereas the top &amp; bottom 20 seeds were used for another analysis. Or the analysis that claims that a peak height cutoff of at least 20 is good, yet figure 2 uses a different peak height cutoff of at least 30. These individually are minor inconsistencies, but come without good explanation and at the very least make it difficult to compare analyses within the paper.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Darnell</title>
		<link>http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/#comment-176</link>
		<dc:creator><![CDATA[Bob Darnell]]></dc:creator>
		<pubDate>Thu, 23 Jul 2009 10:20:43 +0000</pubDate>
		<guid isPermaLink="false">http://youdpreferanargonaute.com/?p=385#comment-176</guid>
		<description><![CDATA[Thanks for your great interest in the Ago HITS-CLIP map that we developed.  All of our Ago HITS-CLIP raw data and  UCSC links will be released on the formal publication date (today), July 23.  We will continue to maintain updates on our project website.

 I can understand that going through all the Supplementary data adds a big burden to the review--it is 70 pages, but this might have helped your review a bit.  For example, re: the concern that we randomly cherry-picked data for our figure panel (I assume you mean Fig 3), what in fact we show are two examples where miR-124 has been rigorously shown by bioinformatic (as done so nicely by your boss Dave Bartel) and mutagenesis studies to regulate expression.  There are not so many such examples.  We illustrate the maps for all the ones we could find in the literature, but couldn&#039;t fit them in Fig 3 for space reasons--all the others are in Supplementary Figure 10 (which contains 10 subfigures).]]></description>
		<content:encoded><![CDATA[<p>Thanks for your great interest in the Ago HITS-CLIP map that we developed.  All of our Ago HITS-CLIP raw data and  UCSC links will be released on the formal publication date (today), July 23.  We will continue to maintain updates on our project website.</p>
<p> I can understand that going through all the Supplementary data adds a big burden to the review&#8211;it is 70 pages, but this might have helped your review a bit.  For example, re: the concern that we randomly cherry-picked data for our figure panel (I assume you mean Fig 3), what in fact we show are two examples where miR-124 has been rigorously shown by bioinformatic (as done so nicely by your boss Dave Bartel) and mutagenesis studies to regulate expression.  There are not so many such examples.  We illustrate the maps for all the ones we could find in the literature, but couldn&#8217;t fit them in Fig 3 for space reasons&#8211;all the others are in Supplementary Figure 10 (which contains 10 subfigures).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hawt RNA Blogs &#171; You&#8217;d Prefer An Argonaute</title>
		<link>http://youdpreferanargonaute.com/2009/06/25/rna-journal-club-62509/#comment-118</link>
		<dc:creator><![CDATA[Hawt RNA Blogs &#171; You&#8217;d Prefer An Argonaute]]></dc:creator>
		<pubDate>Mon, 06 Jul 2009 22:41:17 +0000</pubDate>
		<guid isPermaLink="false">http://youdpreferanargonaute.com/?p=385#comment-118</guid>
		<description><![CDATA[[...] the topic far more sensorial. My blog has recently spotlighted literature with titles like, &#8220;Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps.&#8221; The other RNA blog,  &#8220;Surrender to the Playboy Sheikh&#8221;, and &#8220;Disrobbed [...]]]></description>
		<content:encoded><![CDATA[<p>[...] the topic far more sensorial. My blog has recently spotlighted literature with titles like, &#8220;Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps.&#8221; The other RNA blog,  &#8220;Surrender to the Playboy Sheikh&#8221;, and &#8220;Disrobbed [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>

