http://info.gersteinlab.org/index.php?title=Special:Contributions&feed=atom&target=TaraGersteinInfo - User contributions [en]2021-12-01T04:42:35ZFrom GersteinInfoMediaWiki 1.15.4http://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:50:51Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
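The binarization described above can be sketched in a few lines of R. This is a minimal illustration on made-up p-values, not the processing script used to build C; the helper name `binarize_pvals`, the toy matrix, and the TF-by-gene orientation are all assumptions.

```r
# Toy p-value matrix (made-up values). Rows are TFs and columns are gene
# targets here; the orientation of the real C matrix is an assumption.
pvals <- matrix(
  c(0.0002, 0.5,    0.0009,
    0.03,   0.0001, 0.001),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("TF1", "TF2"), c("geneA", "geneB", "geneC"))
)

# Hypothetical helper: 1 where p is strictly below the threshold, 0 otherwise.
# A p-value exactly equal to the threshold becomes 0; NA stays NA.
binarize_pvals <- function(p, threshold = 0.001) {
  (p < threshold) * 1L
}

C_toy <- binarize_pvals(pvals)
print(C_toy)
```

Multiplying the logical matrix by `1L` keeps the row and column names, so the TF and gene identifiers carry over to the binary matrix.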
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Edit the file paths below so that they point to your copies of the dataset and the CRIT source file.<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest <br />
#Specify x for column variable of interest<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
#This step can be slow (about 2-3 minutes on average), so be patient<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
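The source does not document what `initializer` does internally. One plausible reading of `type="median"` is a median split of the chosen feature, and the sketch below illustrates only that idea. The function `median_split` and the toy feature values are hypothetical and are not part of CRIT.

```r
# Hypothetical stand-in for a median-based labelling step (not CRIT code):
# entries above the median get label 1, all others get label 0.
median_split <- function(values) {
  ifelse(values > median(values, na.rm = TRUE), 1L, 0L)
}

# Made-up feature values for four transcription factors.
feature <- c(TF1 = 0.2, TF2 = 0.9, TF3 = 0.5, TF4 = 0.7)

labels <- median_split(feature)
print(labels)
```

With a label vector of this kind, each TF falls into one of two classes, which is the sort of input the `discriminator` call above appears to expect as its second argument.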
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
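As a rough sketch of the .sif export mentioned above: a SIF file holds one interaction per line in the form source, interaction type, target. The pattern table, its column names, and the helper `to_sif_lines` are hypothetical; CRIT's actual output objects may be shaped differently, and the rows are made up for illustration rather than taken from any result.

```r
# Hypothetical cross patterns, one "source relationship target" triple per row.
patterns <- data.frame(
  source       = c("essential_TF", "essential_TF"),
  relationship = c("enriched_for", "depleted_for"),
  target       = c("essential_target", "high_GC_target"),
  stringsAsFactors = FALSE
)

# Build tab-delimited SIF lines: source <TAB> relationship <TAB> target.
to_sif_lines <- function(df) {
  paste(df$source, df$relationship, df$target, sep = "\t")
}

sif_lines <- to_sif_lines(patterns)

# Write to a temporary file; replace with a real path when exporting.
sif_file <- tempfile(fileext = ".sif")
writeLines(sif_lines, sif_file)
```

Tab delimiters are used so that node names containing spaces would still be read as a single field.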
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:49:37Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Edit the file paths below so that they point to your copies of the dataset and the CRIT source file.<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest <br />
#Specify x for column variable of interest<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
#This step can be slow, so be patient<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:21:26Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Edit the file paths below so that they point to your copies of the dataset and the CRIT source file.<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest <br />
#Specify x for column variable of interest<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:20:05Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Edit the file paths below so that they point to your copies of the dataset and the CRIT source file.<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:12:12Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-02-21T15:10:09Z<p>Tara: /* Motivation and Problem Set Up */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs?<br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-02-21T13:09:40Z<p>Tara: /* License information */</p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A TAR ball that contains the source code for these two components can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit/R/ CRIT_1.0.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
Copyright (c) 2011 Tara Gianoulis<br />
<br />
Permission is hereby granted, free of charge, to any person obtaining a copy<br />
of this software and associated documentation files (the "Software"), to deal<br />
in the Software without restriction, including without limitation the rights<br />
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell<br />
copies of the Software, and to permit persons to whom the Software is<br />
furnished to do so, subject to the following conditions:<br />
<br />
The above copyright notice and this permission notice shall be included in<br />
all copies or substantial portions of the Software.<br />
<br />
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br />
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br />
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE<br />
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br />
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,<br />
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN<br />
THE SOFTWARE.<br />
</pre><br />
<br />
<br></div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-02-21T13:09:25Z<p>Tara: /* Source code */</p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A TAR ball that contains the source code for these two components can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit/R/ CRIT_1.0.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
Copyright (c) 2011 Tara Gianoulis<br />
<br />
Permission is hereby granted, free of charge, to any person obtaining a copy<br />
of this software and associated documentation files (the "Software"), to deal<br />
in the Software without restriction, including without limitation the rights<br />
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell<br />
copies of the Software, and to permit persons to whom the Software is<br />
furnished to do so, subject to the following conditions:<br />
<br />
The above copyright notice and this permission notice shall be included in<br />
all copies or substantial portions of the Software.<br />
<br />
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br />
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br />
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE<br />
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br />
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,<br />
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN<br />
THE SOFTWARE.<br />
</pre><br />
<br />
<br><br />
<br />
=== License information ===<br />
<br />
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br><br />
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.</div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-01-30T22:44:59Z<p>Tara: </p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A TAR ball that contains the source code for these two components can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit/R/ CRIT_1.0.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
<br />
THIS PACKAGE CRIT IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED<br />
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES<br />
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.<br />
</pre><br />
<br />
<br><br />
<br />
=== License information ===<br />
<br />
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br><br />
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.</div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-01-30T21:39:26Z<p>Tara: /* Source code */</p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A TAR ball that contains the source code for these two components can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit CRIT_1.0.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
<br />
THIS PACKAGE CRIT IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED<br />
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES<br />
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.<br />
</pre><br />
<br />
<br><br />
<br />
=== License information ===<br />
<br />
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br><br />
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-30T21:21:15Z<p>Tara: /* Output */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].<br />
<br />
[[File:results.png]]</div>Tarahttp://info.gersteinlab.org/File:Results.pngFile:Results.png2011-01-30T21:20:41Z<p>Tara: </p>
<hr />
<div></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-30T21:17:00Z<p>Tara: /* Output */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < 0.001 was used to call a TF-gene target. We binarized the matrix so that any TF-gene pair with p < 0.001 is coded 1 and every other pair is coded 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-30T21:16:41Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < .001 was used to call a TF-gene target interaction. We binarized the matrix so that any TF-gene pair with p < .001 received a 1 and all other pairs received a 0.<br />
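The binarization step can be sketched as follows. The p-values and dimension names are made up, and placing gene targets in rows and TFs in columns is an assumption made for the sketch.<br />

```r
# Hypothetical binding p-values: 4 gene targets (rows) x 3 TFs (columns)
pvals <- matrix(c(0.2,    0.0004, 0.5,     0.03,
                  0.0009, 0.8,    0.001,   0.6,
                  0.4,    0.07,   0.00001, 0.9),
                nrow = 4,
                dimnames = list(paste0("gene", 1:4), paste0("TF", 1:3)))

# 1 where p < .001, 0 otherwise (a p-value of exactly .001 maps to 0)
C_toy <- (pvals < 0.001) * 1
C_toy
```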
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre><br />
<br />
===Output===<br />
<br />
Cross patterns have a natural "X relationship Y" form, which makes a network an ideal way to visualize the results.<br />
<br />
Cross patterns can easily be formatted in .sif for loading into various network browsers including<br />
<br />
[http://tyna.gersteinlab.org/tyna/ tYNA] or [http://www.cytoscape.org/ Cytoscape].</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-30T21:12:38Z<p>Tara: /* Example Code */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < .001 was used to call a TF-gene target interaction. We binarized the matrix so that any TF-gene pair with p < .001 received a 1 and all other pairs received a 0.<br />
<br />
===Example Code===<br />
<br />
<br />
<pre><br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-30T21:12:16Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al., a threshold of p < .001 was used to call a TF-gene target interaction. We binarized the matrix so that any TF-gene pair with p < .001 received a 1 and all other pairs received a 0.<br />
<br />
===Example Code===<br />
<br />
<pre><br />
#Example Code<br />
<br />
#Load Data<br />
load(file="TFExample.RData")<br />
<br />
#Load CRIT functions<br />
source(file="CRIT.R")<br />
<br />
#Generate label for feature of interest - set x for column variable<br />
tLabel<-initializer(T[,x], type="median")<br />
<br />
#Determine set of targets sensitive to this feature<br />
DC<-discriminator(C, tLabel, multCorrect=TRUE)<br />
<br />
#Generate new label based on sensitivity identified in previous step<br />
gLabel<-labelSlicer(DC, .05)<br />
<br />
#Identify features that seem to discriminate between sens/insens targets<br />
DG<-discriminator(t(G), gLabel, multCorrect=TRUE)<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T17:30:52Z<p>Tara: /* Input Data */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.<br />
<br />
As in Harbison et al, p<.001 was used to indicate a TF-gene target. We binarized the matrix such that any TF-gene pair with a pval<.001 had a 1 and anything greater than this had a 0.</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T16:59:18Z<p>Tara: /* Input Data */</p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200 px| thumb | left| Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T16:58:38Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties<br />
<br />
T and G are both post processed from:<br />
<br />
Y. Xia, E. A. Franzosa, and M. B. Gerstein. Integrated assessment of genomic correlates of protein <br />
evolutionary rate. PLoS Comput Biol, 5(6):e1000413–e1000413, 2009.<br />
<br />
C is post processed from:<br />
<br />
C. T. Harbison, et al. Transcriptional <br />
regulatory code of a eukaryotic genome. Nature, 431(7004):99–104, 2004.</div>Tarahttp://info.gersteinlab.org/CRITCRIT2011-01-29T16:38:40Z<p>Tara: /* Cross Pattern Identification Technique (CRIT) */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
==Algorithm Overview==<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
[[File:alg.png]]<br />
<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only once, at the start, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
#Start with every label set to 0<br />
label<-rep(0, length(values))<br />
#Set those greater than or equal to the threshold to 1<br />
index<-which(values>=t)<br />
label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
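As a small worked example of thresholding and slicing, the sketch below labels a made-up vector at its median and then groups the indices by label. The function name labelAtThreshold and the values are illustrative stand-ins, not part of CRIT.R.<br />

```r
# Stand-in for labelSlicer: 1 where values >= t, 0 elsewhere
labelAtThreshold <- function(values, t) {
  as.numeric(values >= t)
}

# Hypothetical feature values; the median of this vector is 5.0
expr <- c(2.1, 5.0, 3.3, 7.8, 5.0)
lab  <- labelAtThreshold(expr, median(expr))
lab

# The "slices" are just the index sets that share a label
slices <- split(seq_along(expr), lab)
slices
```

Values equal to the threshold receive a 1, which is why both entries of 5.0 land in the upper slice.<br />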
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as an index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat, label, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
for(i in seq_len(nrow(mat))){<br />
temp<-t.test(as.numeric(mat[i,])~label)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
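The comments above mention that other tests, such as the KS test, could serve as the discriminator. One way that might look is sketched below. The name discriminatorKS and the toy matrix are assumptions made for illustration and are not part of CRIT.R.<br />

```r
# Sketch of a KS-based discriminator: one two-sample KS test per row of mat,
# with the columns split into two groups by label
discriminatorKS <- function(mat, label, multCorrect = FALSE) {
  groups <- sort(unique(label))
  stopifnot(length(groups) == 2, length(label) == ncol(mat))
  pvals <- apply(mat, 1, function(r) {
    ks.test(r[label == groups[1]], r[label == groups[2]])$p.value
  })
  if (multCorrect) pvals <- p.adjust(pvals, method = "fdr")
  pvals
}

# Toy data: the first row separates the two groups, the second does not
label <- rep(c(0, 1), each = 6)
mat <- rbind(shifted  = c(1:6, 11:16) + 0.5,
             balanced = c(1, 3, 5, 7, 9, 11, 2, 4, 6, 8, 10, 12))

ksP <- discriminatorKS(mat, label, multCorrect = TRUE)
ksP
```

Unlike the t-test, the KS test compares whole distributions rather than means, so it may flag rows whose groups differ in spread or shape.<br />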
<br />
==Auxiliary Functions==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t statistic from each ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
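The helpers in this section select elements of the t.test result by position: element 1 is the test statistic and element 3 is the p-value. The sketch below, on made-up samples, checks that positional extraction agrees with selecting the element by name, which may be easier to read.<br />

```r
# Two small hypothetical samples
x <- c(5.1, 4.9, 5.6, 5.8, 6.0)
y <- c(6.9, 7.2, 6.5, 7.0, 7.4)
res <- list(t.test(x, y), t.test(y, x))

# Positional extraction, as done by the helpers on this page
byPosition <- unlist(lapply(res, "[", 3))

# Name-based extraction of the same element
byName <- vapply(res, function(h) h$p.value, numeric(1))

all.equal(unname(byPosition), byName)
```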
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRITCRIT2011-01-29T16:38:13Z<p>Tara: /* Cross Pattern Identification Technique (CRIT) */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
[[File:alg.png]]<br />
<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only once, at the start, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
#Start with every label set to 0<br />
label<-rep(0, length(values))<br />
#Set those greater than or equal to the threshold to 1<br />
index<-which(values>=t)<br />
label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as an index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat, label, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
for(i in seq_len(nrow(mat))){<br />
temp<-t.test(as.numeric(mat[i,])~label)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
==Auxiliary Functions==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t statistic from each ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRITCRIT2011-01-29T16:37:47Z<p>Tara: /* Cross Pattern Identification Technique (CRIT) */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
[[File:alg.png|200px|thumb]]<br />
<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only once, at the start, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
#Start with every label set to 0<br />
label<-rep(0, length(values))<br />
#Set those greater than or equal to the threshold to 1<br />
index<-which(values>=t)<br />
label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as an index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat, label, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
for(i in seq_len(nrow(mat))){<br />
temp<-t.test(as.numeric(mat[i,])~label)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
==Auxiliary Functions==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t statistic from each ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/File:Alg.pngFile:Alg.png2011-01-29T16:37:00Z<p>Tara: </p>
<hr />
<div></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T14:43:24Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200px|thumb|left|Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset:<br />
<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T14:42:45Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Motivation and Problem Set Up===<br />
<br />
Cis regulatory elements as a means of regulating gene expression have<br />
been extensively studied. However, beyond such motifs, are there<br />
inherent properties of the targets themselves that make them more or<br />
less likely to be regulated by a given class of transcription factors?<br />
As an example, do essential transcription factors preferentially<br />
regulate essential targets? Are there genome composition features<br />
such as GC or codon bias that influence which targets are regulated by<br />
which TFs? <br />
<br />
===Input Data===<br />
Here, we use three different datasets as shown.<br />
<br />
[[File:schema.png|200px|thumb|left|Data Input Set up]]<br />
<br />
These objects are named as follows in the R dataset.<br />
(1) T: Transcription factors and their associated properties<br />
<br />
(2) C: Connector Matrix matching transcription factors to their associated targets<br />
<br />
(3) G: Gene targets and their associated properties</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T14:35:05Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Input Data===<br />
<br />
[[File:schema.png|200px|thumb|left|Data Input Set up]]</div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-29T14:34:37Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Input Data===<br />
<br />
[[File:schema.png|200px|thumb|left|alt text]]</div>Tarahttp://info.gersteinlab.org/File:Schema.pngFile:Schema.png2011-01-29T14:33:45Z<p>Tara: </p>
<hr />
<div></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-28T21:10:15Z<p>Tara: </p>
<hr />
<div>==Transcription Factor Example==<br />
<br />
===Input Data===</div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-01-28T20:48:42Z<p>Tara: </p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A tarball containing the source code can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit CRIT-0.1.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
<br />
THIS PACKAGE CRIT IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED<br />
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES<br />
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.<br />
</pre><br />
<br />
<br><br />
<br />
=== License information ===<br />
<br />
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br><br />
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.</div>Tarahttp://info.gersteinlab.org/CRITCRIT2011-01-26T16:40:25Z<p>Tara: /* Discriminator */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only once, at the start, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
#Start with every label set to 0<br />
label<-rep(0, length(values))<br />
#Set those greater than or equal to the threshold to 1<br />
index<-which(values>=t)<br />
label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, label=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~label)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
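<br />
A minimal, self-contained sketch of the same idea on simulated data is given here: one two-sample t-test per row of a matrix, followed by an optional FDR adjustment. The function name rowPvals and the toy data are assumptions for this example; it is not the package's own implementation.<br />
<br />
<pre><br />
set.seed(1)<br />
# Toy matrix: 5 rows by 10 columns of simulated values<br />
B <- matrix(rnorm(50), nrow = 5)<br />
# Shift row 1 in the second group so that it separates the two labels<br />
B[1, 6:10] <- B[1, 6:10] + 5<br />
# One made-up 0/1 label per column<br />
label <- rep(c(0, 1), each = 5)<br />
<br />
# Hypothetical stand-in for the discriminator: one Welch t-test per row<br />
rowPvals <- function(mat, label, multCorrect = FALSE) {<br />
  pvals <- vapply(seq_len(nrow(mat)), function(i) {<br />
    t.test(as.numeric(mat[i, ]) ~ label)$p.value<br />
  }, numeric(1))<br />
  # Benjamini-Hochberg adjustment when requested<br />
  if (multCorrect) pvals <- p.adjust(pvals, method = "fdr")<br />
  pvals<br />
}<br />
<br />
raw <- rowPvals(B, label)<br />
adj <- rowPvals(B, label, multCorrect = TRUE)<br />
</pre><br />
<br />
Rows whose values do not differ between the two label groups should give large p-values, which is the sense in which a random label "would show no difference in the slices".<br />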
<br />
==AUXILIARY FUNCTIONS==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
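<br />
The accessor functions on this page index the t-test result by position (1 and 3). As a sketch, assuming the standard htest object returned by t.test, the same fields can also be pulled out by name, which may be easier to read. The toy data and variable names below are made up for illustration.<br />
<br />
<pre><br />
# Two toy t-tests on made-up data<br />
x <- c(1.1, 2.0, 2.9, 4.2, 5.1)<br />
y <- c(3.0, 4.1, 5.2, 6.1, 7.3)<br />
tests <- list(t.test(x, y), t.test(x, mu = 3))<br />
<br />
# Extract fields by name instead of by position<br />
tstats <- vapply(tests, function(h) unname(h$statistic), numeric(1))<br />
pvals  <- vapply(tests, function(h) h$p.value, numeric(1))<br />
<br />
# Positional extraction, as used on this page, for comparison<br />
tstatsByPos <- unname(unlist(lapply(tests, "[", 1)))<br />
pvalsByPos  <- unname(unlist(lapply(tests, "[", 3)))<br />
</pre><br />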
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRITCRIT2011-01-26T13:41:09Z<p>Tara: Created page with '=Cross Pattern Identification Technique (CRIT)= Label, Slice, Discriminate, Repeat. ==Core Functions and Their Parameters== ===Initializer=== Initializer: Only run the first t…'</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
<br />
==AUXILIARY FUNCTIONS==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-24T20:26:59Z<p>Tara: </p>
<hr />
<div>==Breast Cancer Example==<br />
<br />
===Input Data===<br />
<br />
p-value.xls: ER binding peaks from ChIP-chip, downloaded from Myles Brown's Lab at: http://research4.dfci.harvard.edu/brownlab//datasets/index.php?dir=ER_MCF7_whole_human_genome/<br />
<br />
ER_putative_target_genes.txt: maps peaks to genes. Each row is a peak; all genes associated with that peak are listed in the same row, separated by "$"<br />
VeerData-ID.txt: microarray expression data for breast cancer<br />
Veer-Info.txt: sample information<br />
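<br />
As a sketch of how the "$"-separated gene lists might be expanded into a peak-to-gene table, the example below works on made-up strings. The exact column layout of ER_putative_target_genes.txt is not described here, so the peak names, gene symbols, and variable names are all assumptions.<br />
<br />
<pre><br />
# Made-up example: one string of "$"-separated gene symbols per peak<br />
geneField <- c(peak1 = "GREB1$TFF1", peak2 = "MYC", peak3 = "PGR$XBP1$ESR1")<br />
<br />
# fixed = TRUE treats "$" literally rather than as a regex end-of-line anchor<br />
genesPerPeak <- strsplit(geneField, "$", fixed = TRUE)<br />
<br />
# Long-format table with one row per peak-gene pair<br />
peakGene <- data.frame(<br />
  peak = rep(names(genesPerPeak), lengths(genesPerPeak)),<br />
  gene = unlist(genesPerPeak, use.names = FALSE),<br />
  stringsAsFactors = FALSE<br />
)<br />
</pre><br />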
<br />
UNDER CONSTRUCTION</div>Tarahttp://info.gersteinlab.org/CRIT/CRIT/2011-01-24T20:05:17Z<p>Tara: /* AUXILLIARY FUNCTIONS */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
<br />
==AUXILIARY FUNCTIONS==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRIT/workflowCRIT/workflow2011-01-24T20:02:15Z<p>Tara: Created page with '==Under Construction=='</p>
<hr />
<div>==Under Construction==</div>Tarahttp://info.gersteinlab.org/CRIT/CRIT/2011-01-24T19:33:38Z<p>Tara: /* AUXILLIARY FUNCTIONS */</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
<br />
==AUXILIARY FUNCTIONS==<br />
<br />
===Checking against Threshold===<br />
<pre><br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
</pre><br />
<br />
===Return t-test Object===<br />
<pre><br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
</pre><br />
<br />
===Return p Value Object===<br />
<pre><br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}<br />
</pre></div>Tarahttp://info.gersteinlab.org/CRIT/CRIT/2011-01-24T19:32:02Z<p>Tara: </p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
<br />
==AUXILIARY FUNCTIONS==<br />
<br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
<br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}</div>Tarahttp://info.gersteinlab.org/CRIT/CRIT/2011-01-24T19:31:35Z<p>Tara: </p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
Label, Slice, Discriminate, Repeat.<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
<pre><br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
</pre><br />
<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
<pre><br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
</pre><br />
<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
<pre><br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
<br />
</pre><br />
<br />
<br />
===AUXILIARY FUNCTIONS===<br />
<br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
<br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}</div>Tarahttp://info.gersteinlab.org/CRIT/CRIT/2011-01-24T19:27:44Z<p>Tara: Created page with '=Cross Pattern Identification Technique (CRIT)= ==Core Functions and Their Parameters== ===Initializer=== Initializer: Only run the first time to obtain some set of labels for …'</p>
<hr />
<div>=Cross Pattern Identification Technique (CRIT)=<br />
<br />
==Core Functions and Their Parameters==<br />
<br />
===Initializer===<br />
Initializer: Run only the first time, to obtain an initial set of labels for the rows of A. In all remaining steps, the discriminator generates the new label for propagation. Alternatively, this step can be skipped and a label can be supplied directly to the labeler function.<br />
<br />
Required Arguments: Column of A, type of partitioning<br />
<br />
Output: Vector assigning a label to every ROW of Column of A<br />
<br />
===Labeler===<br />
In implementation both labeler and slicer (labelSlicer) are integrated as the label is a simple column vector. We show the breakdown to make the connection between the algorithm design and implementation more transparent.<br />
<br />
Labeler: Transfers labels on columns of previous datasets (A) to rows of new dataset (B)<br />
<br />
Required Arguments: Matrix (B), vector assigning a label to every ROW of A<br />
<br />
Output: Vector assigning a label to every COLUMN of B<br />
<br />
===Slicer===<br />
Slicer: Partition rows of B into "slices" based on labels from A<br />
<br />
Required Arguments: Matrix (B), Vector assigning a label to every COLUMN of B<br />
<br />
Output: Set of slices where slice is defined as a set of rows of B all containing the same label (in practice this is just the grouping variable as opposed to actual data slices)<br />
<br />
===Discriminator===<br />
Discriminator: Evaluates the discriminatory power of labels generated from B in the context of C. Set of slices where slice is defined as a set of rows of B all containing the same label. A random label would show no difference in the slices. (In practice this is implemented as an index instead of the literal data slice).<br />
<br />
Required Arguments: Matrix (B), label<br />
<br />
Optional Arguments: set to TRUE to compute the FDR<br />
<br />
Output: Value of test for every column of B<br />
<br />
#Label, Slice, Discriminate, Repeat<br />
<br />
#---------------CORE FUNCTIONS---------------#<br />
<br />
initializer<-function(A, type=c("median","mean")) {<br />
#Create empty vector with the same number of rows of A<br />
label<-matrix(0, length(A))<br />
#Get value to threshold off of<br />
t<-getThreshold(A, type=type)<br />
#Create label<br />
label<-labelSlicer(A ,t) <br />
return (label)<br />
}<br />
<br />
#In practice this is implemented as a index<br />
#Additional discriminator functions that use KS test, hypergeometric<br />
#distribution, etc are possible<br />
discriminator<-function(mat=B, testLabel=l, multCorrect=FALSE){<br />
<br />
#Create empty list<br />
ttest_struct<-list();<br />
<br />
#Compute value of test for each row<br />
 for(i in seq_len(nrow(mat))){<br />
     temp<-t.test(as.numeric(mat[i,])~testLabel)<br />
ttest_struct[[i]]=temp;<br />
}<br />
<br />
#Extract pvalue from structure<br />
all_pval<-getPvalfromStruct(ttest_struct);<br />
<br />
#Test if we want to compute FDR<br />
 if (multCorrect) { all_pval<-p.adjust(all_pval, method="fdr") }<br />
<br />
return (all_pval)<br />
}<br />
<br />
labelSlicer<-function(values, t){<br />
 #Start with every element labeled 0<br />
 label<-rep(0, length(values))<br />
 #Set those greater than or equal to the threshold to 1<br />
 index<-which(values>=t)<br />
 label[index]<-1<br />
return (label) <br />
}<br />
<br />
<br />
<br />
#---------------AUXILIARY FUNCTIONS---------------#<br />
<br />
#Returns a value to split on for the base case<br />
getThreshold<-function(A, type=c("median","mean")) {<br />
type <- match.arg(type)<br />
threshold <- switch(type,<br />
mean = mean(A),<br />
median = median(A) <br />
)<br />
return (threshold)<br />
}<br />
<br />
#Returns only the t-test from the ttest object<br />
getTfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",1));<br />
return(p)<br />
}<br />
<br />
#Returns only the pvalues from the ttest object<br />
getPvalfromStruct<-function(struct){<br />
p<-unlist(lapply(struct, "[",3));<br />
return(p)<br />
}</div>Tarahttp://info.gersteinlab.org/CRIT/galleryCRIT/gallery2011-01-24T19:12:12Z<p>Tara: Created page with '==Coming Soon!=='</p>
<hr />
<div>==Coming Soon!==</div>Tarahttp://info.gersteinlab.org/CRIT/codeCRIT/code2011-01-24T19:10:06Z<p>Tara: Created page with '<center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center> __NOTOC__ == Code == === Required Software - External === # This is an R package. <br> ===…'</p>
<hr />
<div><center>[http://archive.gersteinlab.org/proj/crit/ '''CRIT Main Page''']</center><br />
<br />
__NOTOC__<br />
<br />
== Code ==<br />
<br />
=== Required Software - External ===<br />
<br />
# This is an R package.<br />
<br />
<br><br />
<br />
=== Download ===<br />
<br />
==== Source code ====<br />
A TAR ball that contains the source code for these two components can be downloaded here:<br />
* [http://archive.gersteinlab.org/proj/crit CRIT-0.1.tar.gz]<br />
<br />
<pre><br />
Important Note<br />
==============<br />
<br />
THIS PACKAGE (RSEQtools including BIOS and MRF) IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESSED OR IMPLIED<br />
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES<br />
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.<br />
</pre><br />
<br />
<br><br />
<br />
=== License information ===<br />
<br />
The software package is released under the [http://creativecommons.org/licenses/by-nc/2.5/legalcode Creative Commons license (Attribution-NonCommercial)]. <br><br />
For more details please refer to the [http://www.gersteinlab.org/misc/permissions.html Permissions Page] on the Gerstein Lab webpage.</div>Tara