Report

Level	Rule Name	Subject	Property	Value
WARN	annotation_whitespace	BFO:0000006	IAO:0000602	(forall (x y t) (if (and (SpatialRegion x) (continuantPartOfAt y x t)) (SpatialRegion y))) // axiom label in BFO2 CLIF: [036-001]
WARN	annotation_whitespace	BFO:0000006	IAO:0000602	(forall (x) (if (SpatialRegion x) (Continuant x))) // axiom label in BFO2 CLIF: [035-001]
WARN	annotation_whitespace	BFO:0000009	IAO:0000602	(forall (x) (if (TwoDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [039-001]
WARN	annotation_whitespace	BFO:0000016	IAO:0000602	(forall (x t) (if (and (RealizableEntity x) (existsAt x t)) (exists (y) (and (MaterialEntity y) (specificallyDepends x y t))))) // axiom label in BFO2 CLIF: [063-002]
WARN	annotation_whitespace	BFO:0000016	IAO:0000602	(forall (x) (if (Disposition x) (and (RealizableEntity x) (exists (y) (and (MaterialEntity y) (bearerOfAt x y t)))))) // axiom label in BFO2 CLIF: [062-002]
WARN	annotation_whitespace	BFO:0000018	IAO:0000602	(forall (x) (if (ZeroDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [037-001]
WARN	annotation_whitespace	BFO:0000019	IAO:0000602	(forall (x) (if (Quality x) (SpecificallyDependentContinuant x))) // axiom label in BFO2 CLIF: [055-001]
WARN	annotation_whitespace	BFO:0000019	IAO:0000602	(forall (x) (if (exists (t) (and (existsAt x t) (Quality x))) (forall (t_1) (if (existsAt x t_1) (Quality x))))) // axiom label in BFO2 CLIF: [105-001]
WARN	annotation_whitespace	BFO:0000023	IAO:0000602	(forall (x) (if (Role x) (RealizableEntity x))) // axiom label in BFO2 CLIF: [061-001]
WARN	annotation_whitespace	BFO:0000026	IAO:0000602	(forall (x) (if (OneDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [038-001]
WARN	annotation_whitespace	BFO:0000028	IAO:0000602	(forall (x) (if (ThreeDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [040-001]
WARN	annotation_whitespace	BFO:0000031	IAO:0000602	(iff (GenericallyDependentContinuant a) (and (Continuant a) (exists (b t) (genericallyDependsOnAt a b t)))) // axiom label in BFO2 CLIF: [074-001]
WARN	annotation_whitespace	BFO:0000034	IAO:0000602	(forall (x) (if (Function x) (Disposition x))) // axiom label in BFO2 CLIF: [064-001]
WARN	annotation_whitespace	BFO:0000040	IAO:0000602	(forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom label in BFO2 CLIF: [019-002]
WARN	annotation_whitespace	BFO:0000040	IAO:0000602	(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [021-002]
WARN	annotation_whitespace	BFO:0000040	IAO:0000602	(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [020-002]
WARN	annotation_whitespace	IAO:0000009	IAO:0000232	"9/22/11 BP: changed the rdfs:label for this class from 'label' to 'datum label' to convey that this class is not intended to cover all kinds of labels (stickers, radiolabels, etc.), and not even all kind of textual labels, but rather the kind of labels occuring in a datum.
"
WARN	annotation_whitespace	IAO:0000115	IAO:0000116	"2012-04-05:
Barry Smith

The official OBI definition, explaining the meaning of a class or property: 'Shall be Aristotelian, formalized and normalized. Can be augmented with colloquial definitions' is terrible.

Can you fix to something like:

A statement of necessary and sufficient conditions explaining the meaning of an expression referring to a class or property.

Alan Ruttenberg

Your proposed definition is a reasonable candidate, except that it is very common that necessary and sufficient conditions are not given. Mostly they are necessary, occasionally they are necessary and sufficient or just sufficient. Often they use terms that are not themselves defined and so they effectively can't be evaluated by those criteria.

On the specifics of the proposed definition:

We don't have definitions of 'meaning' or 'expression' or 'property'. For 'reference' in the intended sense I think we use the term 'denotation'. For 'expression', I think we you mean symbol, or identifier. For 'meaning' it differs for class and property. For class we want documentation that let's the intended reader determine whether an entity is instance of the class, or not. For property we want documentation that let's the intended reader determine, given a pair of potential relata, whether the assertion that the relation holds is true. The 'intended reader' part suggests that we also specify who, we expect, would be able to understand the definition, and also generalizes over human and computer reader to include textual and logical definition.

Personally, I am more comfortable weakening definition to documentation, with instructions as to what is desirable.

We also have the outstanding issue of how to aim different definitions to different audiences. A clinical audience reading chebi wants a different sort of definition documentation/definition from a chemistry trained audience, and similarly there is a need for a definition that is adequate for an ontologist to work with. @en"
WARN	annotation_whitespace	IAO:0000578	IAO:0000116	"Alan, IAO call 20101124: potentially the CRID denotes the instance it was associated with during creation.
@en"
WARN	annotation_whitespace	IAO:0000579	IAO:0000112	PubMed is a CRID registry. It has a dataset of PubMed identifiers associated with journal articles. @en
WARN	annotation_whitespace	OBI:0000070	IAO:0000116	12/3/12: BP: the reference to the 'physical examination' is included to point out that a prediction is not an assay, as that does not require physical examiniation.
WARN	annotation_whitespace	OBI:0000086	IAO:0000232	"Feb 10, 2009. changes after discussion at OBI Consortium Workshop Feb 2-6, 2009. accepted as core term.

May 28 2013. Updated definition taken from ReO based on discussions initiated in Philly 2011 workshop. Former defnition described a narrower view of reagents in chemistry that restricts bearers of the role to be chemical entities (\""a role played by a molecular entity used to produce a chemical reaction to detect, measure, or produce other substances\""). Updated definition allows for broader view of reagents in the domain of biomedical research to include larger materials that have parts that participate chemically in a molecular reaction or interaction.
"
WARN	annotation_whitespace	OBI:0000097	IAO:0000232	"Following OBI call November 2012,26th:

1. it was decided there was no need for moving the children class and making them siblings of study subject role.
2. it also settles the disambiguation about 'study subject'. This is about the individual participating in the investigation/study, Not the 'topic' (as in 'toxicity study') of the investigation/study

This note closes the issue and validates the class definition to be part of the OBI core
editor = PRS
@en"
WARN	annotation_whitespace	OBI:0000124	IAO:0000112	"The relation between the conclusion \""Gene tpbA is involved in EPS production\"" and the data items produced using two sets of organisms, one being a tpbA knockout, the other being tpbA wildtype tested in polysacharide production assays and analyzed using an ANOVA. "
WARN	annotation_whitespace	OBI:0000250	IAO:0000115	A molecular label role which inheres in a material entity and which is realized in the process of detecting a molecular dye that imparts color to some material of interest.
WARN	annotation_whitespace	OBI:0000338	IAO:0000112	Concluding that a gene is upregulated in a tissue sample based on the band intensity in a western blot. Concluding that a patient has a infection based on measurement of an elevated body temperature and reported headache. Concluding that there were problems in an investigation because data from PCR and microarray are conflicting. Concluding that 'defects in gene XYZ cause cancer due to improper DNA repair' based on data from experiments in that study that gene XYZ is involved in DNA repair, and the conclusion of a previous study that cancer patients have an increased number of mutations in this gene.
WARN	annotation_whitespace	OBI:0000339	IAO:0000116	7/18/2011 BP: planning used to itself be a planned process. Barry Smith pointed out that this would lead to an infinite regression, as there would have to be a plan to conduct a planning process, which in itself would be the result of planning etc. Therefore, the restrictions on 'planning' were loosened to allow for informal processes that result in an 'ad hoc plan '. This required changing from 'has_specified_output some plan specifiction' to 'has_participant some plan specification'.
WARN	annotation_whitespace	OBI:0000675	IAO:0000115	is a data transformation objective where the aim is to estimate statistical significance with the aim of proving or disproving a hypothesis by means of some data transformation
WARN	annotation_whitespace	OBI:0000751	IAO:0000112	In a study in which gene expression is measured in patients between 8 month to 4 years old that have mild or severe malaria and in which the hypothesis is that gene expression in that age group is a function of disease status, the gene expression is the dependent variable.
WARN	annotation_whitespace	OBI:0000938	IAO:0000119	"Bjoern Peters
"
WARN	annotation_whitespace	OBI:0000963	IAO:0000112	The labels 'positive' vs. 'negative', or 'left handed', 'right handed', 'ambidexterous', or 'strongly binding', 'weakly binding' , 'not binding', or '+++', '++', '+', '-' etc. form scales of categorical labels.
WARN	annotation_whitespace	OBI:0000968	IAO:0000116	"2012-12-17 JAO: In common lab usage, there is a distinction made between devices and reagents that is difficult to model. Therefore we have chosen to specifically exclude reagents from the definition of \""device\"", and are enumerating the types of roles that a reagent can perform.

2013-6-5 MHB: The following clarifications are outcomes of the May 2013 Philly Workshop. Reagents are distinguished from devices that also participate in scientific techniques by the fact that reagents are chemical or biological in nature and necessarily participate in some chemical interaction or reaction during the realization of their experimental role. By contrast, devices do not participate in such chemical reactions/interactions. Note that there are cases where devices use reagent components during their operation, where the reagent-device distinction is less clear. For example:

(1) An HPLC machine is considered a device, but has a column that holds a stationary phase resin as an operational component. This resin qualifies as a device if it participates purely in size exclusion, but bears a reagent role that is realized in the running of a column if it interacts electrostatically or chemically with the evaluant. The container the resin is in (“the column”) considered alone is a device. So the entire column as well as the entire HPLC machine are devices that have a reagent as an operating part.

(2) A pH meter is a device, but its electrode component bears a reagent role in virtue of its interacting directly with the evaluant in execution of an assay.

(3) A gel running box is a device that has a metallic lead as a component that participates in a chemical reaction with the running buffer when a charge is passed through it. This metallic lead is considered to have a reagent role as a component of this device realized in the running of a gel.

In the examples above, a reagent is an operational component of a device, but the device itself does not realize a reagent role (as bearing a reagent role is not transitive across the part_of relation). In this way, the asserted disjointness between a reagent and device holds, as both roles are never realized in the same bearer during execution of an assay. "
WARN	annotation_whitespace	OBI:0000973	IAO:0000115	"A measurement datum that representing the primary structure of a macromolecule(it's sequence) sometimes associated with an indicator of confidence of that measurement.
"
WARN	annotation_whitespace	OBI:0001172	IAO:0000115	A data item of paired values, one indicating the dose of a material, the other quantitating a measured effect at that dose. The dosing intervals are chosen so that effect values be interpolated by a plotting a curve.
WARN	annotation_whitespace	OBI:0001265	IAO:0000119	adapted from wikipedia (http://en.wikipedia.org/wiki/Familywise_error_rate) @en
WARN	annotation_whitespace	OBI:0001404	IAO:0000232	"MO definition:
The genotype of the individual organism from which the biomaterial was derived. Individual genetic characteristics include polymorphisms, disease alleles, and haplotypes.

examples in ArrayExpress
wild_type
MutaMouse (CD2F1 mice with lambda-gt10LacZ integration)
AlfpCre; SNF5 flox/knockout
p53 knock out
C57Bl/6 gp130lox/lox MLC2vCRE/+
fer-15; fem-1
df/df
pat1-114/pat1-114 ade6-M210/ade6-M216 h+/h+ (cells are diploid)
@en"
WARN	annotation_whitespace	OBI:0001573	IAO:0000112	The part of a FASTA file that contains the letters ACTGGGAA
WARN	annotation_whitespace	OBI:0001573	IAO:0000232	8/29/11 call: This is added after a request from Melanie and Yu. They should review it further. This should be a child of 'sequence data', and as of the current definition will infer there.
WARN	annotation_whitespace	OBI:0001834	IAO:0000112	"Concluding that the length of the hypotenuse is equal to the square root of the sum of squares of the other two sides in a right-triangle.
Concluding that a gene is upregulated in a tissue sample based on the band intensity in a western blot. Concluding that a patient has a infection based on measurement of an elevated body temperature and reported headache. Concluding that there were problems in an investigation because data from PCR and microarray are conflicting.
"
WARN	annotation_whitespace	OBI:0001909	IAO:0000116	"In the Philly 2013 workshop, we recognized the limitations of \""conclusion textual entity\"", and we introduced this as more general. The need for the 'textual entity' term going forward is up for future debate. "
WARN	annotation_whitespace	OBI:0001912	IAO:0000115	A processed material that serves as a liquid vehicle for freezing cells for long term quiescent stroage, which contains chemicls needed to sustain cell viability across freeze-thaw cycles.
WARN	annotation_whitespace	RO:0001901	IAO:0000115	"

## Elucidation

This is used when the statement/axiom is assumed to hold true 'eternally'

## How to interpret (informal)

First the \""atemporal\"" FOL is derived from the OWL using the standard
interpretation. This axiom is temporalized by embedding the axiom
within a for-all-times quantified sentence. The t argument is added to
all instantiation predicates and predicates that use this relation.

## Example

Class: nucleus
SubClassOf: part_of some cell

forall t :
forall n :
instance_of(n,Nucleus,t)
implies
exists c :
instance_of(c,Cell,t)
part_of(n,c,t)

## Notes

This interpretation is not the same as an at-all-times relation

"
WARN	duplicate_exact_synonym	STATO:0000636	IAO:0000118	NNS
WARN	duplicate_exact_synonym	STATO:0000637	IAO:0000118	NNS
WARN	equivalent_class_axiom_no_genus	OBI:0000070	OBI:0000417	OBI:0000441
WARN	equivalent_class_axiom_no_genus	OBI:0000094	OBI:0000417	OBI:0000456
WARN	equivalent_class_axiom_no_genus	OBI:0000274	OBI:0000417	OBI:0000434
WARN	equivalent_class_axiom_no_genus	OBI:0000435	OBI:0000299	OBI:0001305
WARN	equivalent_class_axiom_no_genus	OBI:0000443	OBI:0000417	OBI:0000437
WARN	equivalent_class_axiom_no_genus	OBI:0000451	OBI:0000312	OBI:0200169
WARN	equivalent_class_axiom_no_genus	OBI:0000648	OBI:0000312	OBI:0200175
WARN	equivalent_class_axiom_no_genus	OBI:0000649	OBI:0000312	OBI:0200184
WARN	equivalent_class_axiom_no_genus	OBI:0000650	OBI:0000417	OBI:0200031
WARN	equivalent_class_axiom_no_genus	OBI:0000652	OBI:0000417	OBI:0000686
WARN	equivalent_class_axiom_no_genus	OBI:0000662	OBI:0000312	OBI:0000668
WARN	equivalent_class_axiom_no_genus	OBI:0000668	OBI:0000417	OBI:0200186
WARN	equivalent_class_axiom_no_genus	OBI:0000673	OBI:0000417	OBI:0000675
WARN	equivalent_class_axiom_no_genus	OBI:0000674	OBI:0000312	OBI:0200181
WARN	equivalent_class_axiom_no_genus	OBI:0000679	OBI:0000312	OBI:0200170
WARN	equivalent_class_axiom_no_genus	OBI:0000838	OBI:0000417	OBI:0000806
WARN	equivalent_class_axiom_no_genus	OBI:0000932	RO:0000085	OBI:0000372
WARN	equivalent_class_axiom_no_genus	OBI:0000967	RO:0000085	OBI:0000370
WARN	equivalent_class_axiom_no_genus	OBI:0001032	RO:0000085	OBI:0000367
WARN	equivalent_class_axiom_no_genus	OBI:0001834	OBI:0000299	IAO:0000144
WARN	equivalent_class_axiom_no_genus	OBI:0002089	RO:0000085	OBI:0000401
WARN	equivalent_class_axiom_no_genus	OBI:0200000	OBI:0000417	OBI:0200166
WARN	equivalent_class_axiom_no_genus	OBI:0200073	OBI:0000299	OBI:0001265
WARN	equivalent_class_axiom_no_genus	OBI:0200089	OBI:0000417	OBI:0000791
WARN	equivalent_class_axiom_no_genus	OBI:0200163	OBI:0000299	OBI:0001442
WARN	equivalent_class_axiom_no_genus	OBI:0200171	OBI:0000417	OBI:0200172
WARN	equivalent_class_axiom_no_genus	OBI:0200194	OBI:0000417	OBI:0200083
WARN	equivalent_class_axiom_no_genus	OBI:0600014	OBI:0000417	OBI:0000639
WARN	equivalent_class_axiom_no_genus	OBI:1110108	RO:0000087	OBI:0000319
WARN	equivalent_class_axiom_no_genus	OBI:1110109	RO:0000087	OBI:0000444
WARN	equivalent_class_axiom_no_genus	STATO:0000027	OBI:0000417	STATO:0000121
WARN	equivalent_class_axiom_no_genus	STATO:0000033	OBI:0000312	OBI:0200117
WARN	equivalent_class_axiom_no_genus	STATO:0000085	OBI:0000295	STATO:0000175
WARN	equivalent_class_axiom_no_genus	STATO:0000119	OBI:0000299	STATO:0000144
WARN	equivalent_class_axiom_no_genus	STATO:0000131	OBI:0000417	STATO:0000183
WARN	equivalent_class_axiom_no_genus	STATO:0000133	BFO:0000062	OBI:0200201
WARN	equivalent_class_axiom_no_genus	STATO:0000137	OBI:0000417	STATO:0000226
WARN	equivalent_class_axiom_no_genus	STATO:0000191	OBI:0000417	STATO:0000224
WARN	equivalent_class_axiom_no_genus	STATO:0000202	OBI:0000417	STATO:0000253
WARN	equivalent_class_axiom_no_genus	STATO:0000247	OBI:0000417	STATO:0000173
WARN	equivalent_class_axiom_no_genus	STATO:0000279	OBI:0000417	STATO:0000255
WARN	equivalent_class_axiom_no_genus	STATO:0000337	OBI:0000299	STATO:0000485
WARN	equivalent_class_axiom_no_genus	STATO:0000443	OBI:0000417	STATO:0000439
WARN	equivalent_class_axiom_no_genus	STATO:0000471	STATO:0000403	STATO:0000039
WARN	equivalent_class_axiom_no_genus	STATO:0000697	OBI:0000417	STATO:0000173
WARN	equivalent_pair	OBI:0000674	owl:equivalentClass	STATO:0000574
WARN	equivalent_pair	OBI:0000679	owl:equivalentClass	STATO:0000573
WARN	equivalent_pair	STATO:0000247	owl:equivalentClass	STATO:0000697
WARN	missing_definition	BFO:0000062	IAO:0000115
WARN	missing_definition	BFO:0000063	IAO:0000115
WARN	missing_definition	BFO:0000141	IAO:0000115
WARN	missing_definition	IAO:0000004	IAO:0000115
WARN	missing_definition	IAO:0000039	IAO:0000115
WARN	missing_definition	IAO:0000114	IAO:0000115
WARN	missing_definition	IAO:0000404	IAO:0000115
WARN	missing_definition	IAO:0000406	IAO:0000115
WARN	missing_definition	IAO:0000407	IAO:0000115
WARN	missing_definition	IAO:0000582	IAO:0000115
WARN	missing_definition	NCBITaxon:10239	IAO:0000115
WARN	missing_definition	NCBITaxon:117571	IAO:0000115
WARN	missing_definition	NCBITaxon:2	IAO:0000115
WARN	missing_definition	NCBITaxon:2157	IAO:0000115
WARN	missing_definition	NCBITaxon:2759	IAO:0000115
WARN	missing_definition	NCBITaxon:314146	IAO:0000115
WARN	missing_definition	NCBITaxon:32523	IAO:0000115
WARN	missing_definition	NCBITaxon:32524	IAO:0000115
WARN	missing_definition	NCBITaxon:33154	IAO:0000115
WARN	missing_definition	NCBITaxon:33213	IAO:0000115
WARN	missing_definition	NCBITaxon:40674	IAO:0000115
WARN	missing_definition	NCBITaxon:7742	IAO:0000115
WARN	missing_definition	NCBITaxon:9606	IAO:0000115
WARN	missing_definition	RO:0001900	IAO:0000115
WARN	missing_definition	RO:0002222	IAO:0000115
WARN	missing_definition	obo:obi.owl	IAO:0000115
WARN	missing_definition	dc11:contributor	IAO:0000115
WARN	missing_definition	dc11:creator	IAO:0000115
WARN	missing_definition	dc11:date	IAO:0000115
WARN	missing_definition	dc11:description	IAO:0000115
WARN	missing_definition	dc11:format	IAO:0000115
WARN	missing_definition	dc11:source	IAO:0000115
WARN	missing_definition	dc11:subject	IAO:0000115
WARN	missing_definition	dc:license	IAO:0000115
WARN	multiple_equivalent_classes	OBI:0000674	owl:equivalentClass	STATO:0000574
WARN	multiple_equivalent_classes	OBI:0000674	owl:equivalentClass	blank node
WARN	multiple_equivalent_classes	OBI:0000679	owl:equivalentClass	STATO:0000573
WARN	multiple_equivalent_classes	OBI:0000679	owl:equivalentClass	blank node
WARN	multiple_equivalent_classes	STATO:0000247	owl:equivalentClass	STATO:0000697
WARN	multiple_equivalent_classes	STATO:0000247	owl:equivalentClass	blank node
INFO	lowercase_definition	BFO:0000016	IAO:0000600	b is a disposition means: b is a realizable entity & b’s bearer is some material entity & b is such that if it ceases to exist, then its bearer is physically changed, & b’s realization occurs when and because this bearer is in some special physical circumstances, & this realization occurs in virtue of the bearer’s physical make-up. (axiom label in BFO2 Reference: [062-002])@en
INFO	lowercase_definition	BFO:0000019	IAO:0000600	a quality is a specifically dependent continuant that, in contrast to roles and dispositions, does not require any further process in order to be realized. (axiom label in BFO2 Reference: [055-001])@en
INFO	lowercase_definition	BFO:0000023	IAO:0000600	b is a role means: b is a realizable entity & b exists because there is some single bearer that is in some special physical, social, or institutional set of circumstances in which this bearer does not have to be& b is not such that, if it ceases to exist, then the physical make-up of the bearer is thereby changed. (axiom label in BFO2 Reference: [061-001])@en
INFO	lowercase_definition	BFO:0000050	IAO:0000115	a core relation that holds between a part and its whole@en
INFO	lowercase_definition	BFO:0000051	IAO:0000115	a core relation that holds between a whole and its part@en
INFO	lowercase_definition	BFO:0000054	IAO:0000600	[copied from inverse property 'realizes'] to say that b realizes c at t is to assert that there is some material entity d & b is a process which has participant d at t & c is a disposition or role of which d is bearer_of at t& the type instantiated by b is correlated with the type instantiated by c. (axiom label in BFO2 Reference: [059-003])@en
INFO	lowercase_definition	BFO:0000055	IAO:0000600	to say that b realizes c at t is to assert that there is some material entity d & b is a process which has participant d at t & c is a disposition or role of which d is bearer_of at t& the type instantiated by b is correlated with the type instantiated by c. (axiom label in BFO2 Reference: [059-003])@en
INFO	lowercase_definition	IAO:0000001	IAO:0000115	a directive information entity that specifies what should happen if the trigger condition is fulfilled@en
INFO	lowercase_definition	IAO:0000005	IAO:0000115	a directive information entity that describes an intended process endpoint. When part of a plan specification the concretization is realized in a planned process in which the bearer tries to effect the world so that the process endpoint is achieved.@en
INFO	lowercase_definition	IAO:0000007	IAO:0000115	a directive information entity that describes an action the bearer will take@en
INFO	lowercase_definition	IAO:0000027	IAO:0000115	a data item is an information content entity that is intended to be a truthful statement about something (modulo, e.g., measurement precision or other systematic errors) and is constructed/acquired by a method which reliably tends to produce (approximately) truthful statements.@en
INFO	lowercase_definition	IAO:0000032	IAO:0000115	a scalar measurement datum is a measurement datum that is composed of two parts, numerals and a unit label.@en
INFO	lowercase_definition	IAO:0000055	IAO:0000115	a rule is an executable which guides, defines, restricts actions@en
INFO	lowercase_definition	IAO:0000102	IAO:0000115	data about an ontology part is a data item about a part of an ontology, for example a term@en
INFO	lowercase_definition	IAO:0000119	IAO:0000115	formal citation, e.g. identifier in external database to indicate / attribute source(s) for the definition. Free text indicate / attribute source(s) for the definition. EXAMPLE: Author Name, URI, MeSH Term C04, PUBMED ID, Wiki uri on 31.01.2007@en
INFO	lowercase_definition	IAO:0000121	IAO:0000115	term created to ease viewing/sort terms for development purpose, and will not be included in a release@en
INFO	lowercase_definition	IAO:0000136	IAO:0000115	is_about is a (currently) primitive relation that relates an information artifact to an entity.@en
INFO	lowercase_definition	IAO:0000219	IAO:0000115	denotes is a primitive, instance-level, relation obtaining between an information content entity and some portion of reality. Denotation is what happens when someone creates an information content entity E in order to specifically refer to something. The only relation between E and the thing is that E can be used to 'pick out' the thing. This relation connects those two together. Freedictionary.com sense 3: To signify directly; refer to specifically@en
INFO	lowercase_definition	IAO:0000221	IAO:0000115	"m is a quality measurement of q at t when
q is a quality
there is a measurement process p that has specified output m, a measurement datum, that is about q@en"
INFO	lowercase_definition	IAO:0000413	IAO:0000115	relates a process to a time-measurement-datum that represents the duration of the process@en
INFO	lowercase_definition	IAO:0000417	IAO:0000115	inverse of the relation of is quality measurement of@en
INFO	lowercase_definition	IAO:0000572	IAO:0000115	a planned process in which a document is created or added to by including the specified input in it.@en
INFO	lowercase_definition	IAO:0000581	IAO:0000115	relates a time stamped measurement datum to the time measurement datum that denotes the time when the measurement was taken@en
INFO	lowercase_definition	IAO:0000583	IAO:0000115	relates a time stamped measurement datum to the measurement datum that was measured@en
INFO	lowercase_definition	OBI:0000066	IAO:0000115	a planned process that consists of parts: planning, study design execution, documentation and which produce conclusion(s).@en
INFO	lowercase_definition	OBI:0000067	IAO:0000115	a role that inheres in a material entity that is realized in an assay in which data is generated about the bearer of the evaluant role@en
INFO	lowercase_definition	OBI:0000079	IAO:0000115	a processed material that provides the needed nourishment for microorganisms or cells grown in vitro.
INFO	lowercase_definition	OBI:0000112	IAO:0000115	a role borne by a material entity that is gained during a specimen collection process and that can be realized by use of the specimen in an investigation@en
INFO	lowercase_definition	OBI:0000181	IAO:0000115	a population is a collection of individuals from the same taxonomic class living, counted or sampled at a particular site or in a particular area@en
INFO	lowercase_definition	OBI:0000274	IAO:0000115	is a process with the objective to place a material entity bearing the 'material to be added role' into a material bearing the 'target of material addition role'.@en
INFO	lowercase_definition	OBI:0000319	IAO:0000115	material to be added role is a protocol participant role realized by a material which is added into a material bearing the target of material addition role in a material addition process@en
INFO	lowercase_definition	OBI:0000339	IAO:0000115	a process of creating or modifying a plan specification@en
INFO	lowercase_definition	OBI:0000416	IAO:0000115	cloning insert role is a role which inheres in DNA or RNA and is realized by the process of being inserted into a cloning vector in a cloning process.@en
INFO	lowercase_definition	OBI:0000423	IAO:0000115	an extract is a material entity which results from an extraction process@en
INFO	lowercase_definition	OBI:0000427	IAO:0000115	"(protein or rna) or has_part (protein or rna) and
has_function some GO:0003824 (catalytic activity)@en"
INFO	lowercase_definition	OBI:0000434	IAO:0000115	is the specification of an objective to add a material into a target material. The adding is asymmetric in the sense that the target material largely retains its identity@en
INFO	lowercase_definition	OBI:0000435	IAO:0000115	"an assay which generates data about a genotype from a specimen of genomic DNA. A variety of
techniques and instruments can be used to produce information about sequence variation at particular genomic positions.@en"
INFO	lowercase_definition	OBI:0000437	IAO:0000115	an assay objective to determine the presence or concentration of an analyte in the evaluant@en
INFO	lowercase_definition	OBI:0000441	IAO:0000115	an objective specification to determine a specified type of information about an evaluated entity (the material entity bearing evaluant role)@en
INFO	lowercase_definition	OBI:0000444	IAO:0000115	target of material addition role is a role realized by an entity into which a material is added in a material addition process
INFO	lowercase_definition	OBI:0000456	IAO:0000115	an objective specifiction that creates an specific output object from input materials.@en
INFO	lowercase_definition	OBI:0000471	IAO:0000115	a planned process that carries out a study design
INFO	lowercase_definition	OBI:0000639	IAO:0000115	is an objective to transform a material entity into spatially separated components.
INFO	lowercase_definition	OBI:0000643	IAO:0000115	the relation of the cells in the finger of the skin to the finger, in which an indeterminate number of grains are parts of the whole by virtue of being grains in a collective that is part of the whole, and in which removing one granular part does not nec- essarily damage or diminish the whole. Ontological Whether there is a fixed, or nearly fixed number of parts - e.g. fingers of the hand, chambers of the heart, or wheels of a car - such that there can be a notion of a single one being missing, or whether, by contrast, the number of parts is indeterminate - e.g., cells in the skin of the hand, red cells in blood, or rubber molecules in the tread of the tire of the wheel of the car.
INFO	lowercase_definition	OBI:0000652	IAO:0000115	is a material processing with the objective to combine two or more material entities as input into a single material entity as output.
INFO	lowercase_definition	OBI:0000671	IAO:0000115	a material obtained from an organism in order to be a representative of the whole
INFO	lowercase_definition	OBI:0000675	IAO:0000115	is a data transformation objective where the aim is to estimate statistical significance with the aim of proving or disproving a hypothesis by means of some data transformation
INFO	lowercase_definition	OBI:0000686	IAO:0000115	is an objective to obtain an output material that contains several input materials.
INFO	lowercase_definition	OBI:0000722	IAO:0000115	is a collection of short paired tags from the two ends of DNA fragments are extracted and covalently linked as ditag constructs
INFO	lowercase_definition	OBI:0000736	IAO:0000115	is a collection of short tags from DNA fragments, are extracted and covalently linked as single tag constructs
INFO	lowercase_definition	OBI:0000750	IAO:0000115	a directive information entity that is part of a study design. Independent variables are entities whose values are selected to determine its relationship to an observed phenomenon (the dependent variable). In such an experiment, an attempt is made to find evidence that the values of the independent variable determine the values of the dependent variable (that which is being measured). The independent variable can be changed as required, and its values do not represent a problem requiring explanation in an analysis, but are taken simply as given. The dependent variable on the other hand, usually cannot be directly controlled@en
INFO	lowercase_definition	OBI:0000751	IAO:0000115	dependent variable specification is part of a study design. The dependent variable is the event studied and expected to change when the independent variable varies.@en
INFO	lowercase_definition	OBI:0000811	IAO:0000115	a quality of a DNA molecule that inheres in its bearer due to the order of its DNA nucleotide residues.
INFO	lowercase_definition	OBI:0000838	IAO:0000115	a process that achieves the objective to maintain some or all of the characteristics of an input material over time
INFO	lowercase_definition	OBI:0000931	IAO:0000115	the part of the execution of an intervention design study which is varied between two or more subjects in the study
INFO	lowercase_definition	OBI:0001032	IAO:0000115	a device which has a function to emit light.
INFO	lowercase_definition	OBI:0001143	IAO:0000115	a labeled specimen that is the output of a labeling process and has grain labeled nucleic acid for detection of the nucleic acid in future experiments.
INFO	lowercase_definition	OBI:0001225	IAO:0000115	a genetic characteristics information which is a part of genotype information that identifies the population of organisms@en
INFO	lowercase_definition	OBI:0001305	IAO:0000115	a genetic characteristics information that is about the genetic material of an organism and minimally includes information about the genetic background and can in addition contain information about specific alleles, genetic modifications, etc.@en
INFO	lowercase_definition	OBI:0001352	IAO:0000115	a genetic alteration information that is about one of two or more alternative forms of a gene or marker sequence and differing from other alleles at one or more mutational sites based on sequence. Polymorphisms are included in this definition.@en
INFO	lowercase_definition	OBI:0001364	IAO:0000115	a genetic characteristics information that is about known changes or the lack thereof from the genetic background, including allele information, duplication, insertion, deletion, etc.@en
INFO	lowercase_definition	OBI:0001404	IAO:0000115	a data item that is about genetic material including polymorphisms, disease alleles, and haplotypes.@en
INFO	lowercase_definition	OBI:0001936	IAO:0000115	a material entity that is the specified output of an addition of molecular label process that aims to label some molecular target to allow for its detection in a detection of molecular label assay
INFO	lowercase_definition	OBI:0100064	IAO:0000115	a screening library is a collection of materials engineered to identify qualities of a subset of its members during a screening process?@en
INFO	lowercase_definition	OBI:0200044	IAO:0000115	an agglomerative hierarchical clustering which generates successive clusters based on a distance measure, where the distance between two clusters is calculated as the maximum distance between objects from the first cluster and objects from the second cluster.@en
INFO	lowercase_definition	OBI:0200066	IAO:0000115	a data transformation that performs more than one hypothesis test simultaneously, a closed-test procedure, that controls the familywise error rate for all the k hypotheses at level α in the strong sense. Objective: multiple testing correction
INFO	lowercase_definition	OBI:0300311	IAO:0000115	observation design is a study design in which subjects are monitored in the absence of any active intervention by experimentalists.@en
INFO	lowercase_definition	OBI:0302903	IAO:0000115	a planned process by which totally or partially complementary, single-stranded nucleic acids are combined into a single molecule called heteroduplex or homoduplex to an extent depending on the amount of complementarity.@en
INFO	lowercase_definition	OBI:0500002	IAO:0000115	a study design which uses the same individuals and exposes them to a set of conditions. The effect of order and practice can be a confounding factor in such designs@en
INFO	lowercase_definition	OBI:0500003	IAO:0000115	a repeated measure design which ensures that experimental units receive, in sequence, the treatment (or the control), and then, after a specified time interval (aka wash-out periods), switch to the control (or treatment). In this design, subjects (patients in human context) serve as their own controls, and randomization may be used to determine the order in which a subject receives the treatment and control@en
INFO	lowercase_definition	OBI:0500014	IAO:0000115	factorial design is_a study design which is used to evaluate two or more factors simultaneously. The treatments are combinations of levels of the factors. The advantages of factorial designs over one-factor-at-a-time experiments are that they are more efficient and they allow interactions to be detected. In statistics, a factorial design experiment is an experiment whose design consists of two or more factors, each with discrete possible values or levels, and whose experimental units take on all possible combinations of these levels across all such factors. Such an experiment allows studying the effect of each factor on the response variable, as well as the effects of interactions between factors on the response variable.@en
INFO	lowercase_definition	OBI:0500015	IAO:0000115	a factorial design which has 2 experimental factors (aka independent variables) and 2 factor levels per experimental factor@en
INFO	lowercase_definition	OBI:0600005	IAO:0000115	a process with the objective to obtain a material entity that was part of an organism for potential future use in an investigation@en
INFO	lowercase_definition	OBI:0600014	IAO:0000115	a material processing in which components of an input material become segregated in space@en
INFO	lowercase_definition	OBI:0600015	IAO:0000115	group assignment is a process which has an organism as specified input and during which a role is assigned@en
INFO	lowercase_definition	OBI:0600024	IAO:0000115	a protocol application in which cells are kept alive in a defined environment outside of an organism. part of cell_culturing@en
INFO	lowercase_definition	OBI:0600036	IAO:0000115	a process through which a new type of cell culture or cell line is created, either through the isolation and culture of one or more cells from a fresh source, or the deliberate experimental modification of an existing cell culture (e.g. passaging a primary culture to become a secondary culture or line, or the immortalization or stable genetic modification of an existing culture or line).
INFO	lowercase_definition	OBI:0600038	IAO:0000115	a material processing technique intended to add a molecular label to some input material entity, to allow detection of the molecular target of this label in a detection of molecular label assay@en
INFO	lowercase_definition	OBI:0600047	IAO:0000115	the use of a chemical or biochemical means to infer the sequence of a biomaterial@en
INFO	lowercase_definition	OBI:0600064	IAO:0000115	a planned process with the objective to insert genetic material into a cloning vector for future replication of the inserted material@en
INFO	lowercase_definition	OBI:0666667	IAO:0000115	a material separation to recover the nucleic acid fraction of an input material@en
INFO	lowercase_definition	OBI:1000029	IAO:0000115	a phage display library is a collection of materials in which a mixture of genes or gene fragments is expressed and can be individually selected and amplified.@en
INFO	lowercase_definition	OBI:1110108	IAO:0000115	a material that is added to another one in a material combination process
INFO	lowercase_definition	obo:REO_0000171	IAO:0000115	a reagent role inhering in a molecular entity intended to associate with some molecular target to serve as a proxy for the presence, abundance, or location of this target in a detection of molecular label assay.
INFO	lowercase_definition	obo:REO_0000280	IAO:0000115	a molecular reagent intended to associate with some molecular target to serve as a proxy for the presence, abundance, or location of this target in a detection of molecular label assay
INFO	lowercase_definition	RO:0000052	IAO:0000115	a relation between a specifically dependent continuant (the dependent) and an independent continuant (the bearer), in which the dependent specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000053	IAO:0000115	a relation between an independent continuant (the bearer) and a specifically dependent continuant (the dependent), in which the dependent specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000056	IAO:0000115	a relation between a continuant and a process, in which the continuant is somehow involved in the process@en
INFO	lowercase_definition	RO:0000057	IAO:0000115	a relation between a process and a continuant, in which the continuant is somehow involved in the process@en
INFO	lowercase_definition	RO:0000079	IAO:0000115	a relation between a function and an independent continuant (the bearer), in which the function specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000080	IAO:0000115	a relation between a quality and an independent continuant (the bearer), in which the quality specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000081	IAO:0000115	a relation between a role and an independent continuant (the bearer), in which the role specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000085	IAO:0000115	a relation between an independent continuant (the bearer) and a function, in which the function specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000086	IAO:0000115	a relation between an independent continuant (the bearer) and a quality, in which the quality specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0000087	IAO:0000115	a relation between an independent continuant (the bearer) and a role, in which the role specifically depends on the bearer for its existence@en
INFO	lowercase_definition	RO:0001000	IAO:0000115	a relation between two distinct material entities, the new entity and the old entity, in which the new entity begins to exist when the old entity ceases to exist, and the new entity inherits the significant portion of the matter of the old entity@en
INFO	lowercase_definition	RO:0001001	IAO:0000115	a relation between two distinct material entities, the old entity and the new entity, in which the new entity begins to exist when the old entity ceases to exist, and the new entity inherits the significant portion of the matter of the old entity@en
INFO	lowercase_definition	RO:0001015	IAO:0000115	a relation between two independent continuants, the location and the target, in which the target is entirely within the location@en
INFO	lowercase_definition	RO:0001025	IAO:0000115	a relation between two independent continuants, the target and the location, in which the target is entirely within the location@en
INFO	lowercase_definition	RO:0001901	IAO:0000115	"

## Elucidation

This is used when the statement/axiom is assumed to hold true 'eternally'

## How to interpret (informal)

First the \""atemporal\"" FOL is derived from the OWL using the standard
interpretation. This axiom is temporalized by embedding the axiom
within a for-all-times quantified sentence. The t argument is added to
all instantiation predicates and predicates that use this relation.

## Example

Class: nucleus
SubClassOf: part_of some cell

forall t :
forall n :
instance_of(n,Nucleus,t)
implies
exists c :
instance_of(c,Cell,t)
part_of(n,c,t)

## Notes

This interpretation is not the same as an at-all-times relation

"
INFO	lowercase_definition	STATO:0000001	IAO:0000115	property to indicate that a design declares a variable; the inverse property is 'is declared by'@en
INFO	lowercase_definition	STATO:0000002	IAO:0000115	an electronic file is an information content entity which conforms to a specification or format and which is meant to hold data and information in digital form, accessible to software agents@en
INFO	lowercase_definition	STATO:0000003	IAO:0000115	a balanced design is an experimental design where all experimental groups have an equal number of subject observations@en
INFO	lowercase_definition	STATO:0000004	IAO:0000115	property to indicate the variables declared by a design; the inverse property is 'declares'@en
INFO	lowercase_definition	STATO:0000005	IAO:0000115	a single factor design is a study design which declares exactly 1 independent variable@en
INFO	lowercase_definition	STATO:0000006	IAO:0000115	x-axis is a cartesian coordinate axis which is orthogonal to the y-axis and the z-axis@en
INFO	lowercase_definition	STATO:0000007	IAO:0000115	an axis is a line graph used as reference line for the measurement of coordinates.@en
INFO	lowercase_definition	STATO:0000008	IAO:0000115	y-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the z-axis@en
INFO	lowercase_definition	STATO:0000011	IAO:0000115	a cartesian axis is one of the 3 axes in a cartesian coordinate system defining a referential in 3 dimensions. each of the axes is orthogonal to the other 2@en
INFO	lowercase_definition	STATO:0000012	IAO:0000115	z-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the y-axis@en
INFO	lowercase_definition	STATO:0000013	IAO:0000115	a 2 dimensional cartesian coordinate system is a cartesian coordinate system which defines 2 orthogonal one dimensional axes and which may be used to describe a 2 dimensional spatial region.
INFO	lowercase_definition	STATO:0000019	IAO:0000115	normal distribution hypothesis is a goodness of fit hypothesis stating that the distribution computed from the sample population fits a normal distribution.@en
INFO	lowercase_definition	STATO:0000021	IAO:0000115	a confidence interval which covers 90% of the sampling distribution, meaning that there is a 10% risk of false positive (type I error)@en
INFO	lowercase_definition	STATO:0000024	IAO:0000115	a three dimensional cartesian coordinate system is a cartesian coordinate system which defines 3 orthogonal one dimensional axes and which may be used to describe a 3 dimensional spatial region.
INFO	lowercase_definition	STATO:0000027	IAO:0000115	linkage between 2 categorical variable test is a statistical test which evaluates if there is an association between a predictor variable assuming discrete values and a response variable also assuming discrete values@en
INFO	lowercase_definition	STATO:0000028	IAO:0000115	measure of variation or statistical dispersion is a data item which describes how much a theoretical distribution or dataset is spread.@en
INFO	lowercase_definition	STATO:0000029	IAO:0000115	a measure of central tendency is a data item which attempts to describe a set of data by identifying the value of its centre.@en
INFO	lowercase_definition	STATO:0000031	IAO:0000115	binary classification (or binomial classification) is a data transformation which aims to cast members of a set into 2 disjoint groups depending on whether the elements have a given property/feature or not.@en
INFO	lowercase_definition	STATO:0000032	IAO:0000115	an alternative term used for STATO statistical ontology and ISA team@en
INFO	lowercase_definition	STATO:0000034	IAO:0000115	a model parameter is a data item which is part of a model and which is meant to characterize a theoretical or unknown population. a model parameter may be estimated by considering the properties of samples presumably taken from the theoretical population@en
INFO	lowercase_definition	STATO:0000035	IAO:0000115	the range is a measure of variation which describes the difference between the lowest score and the highest score in a set of numbers (a data set)
INFO	lowercase_definition	STATO:0000038	IAO:0000115	a set of 2 subjects which results from a pairing process which assigns subjects to a set based on a pairing rule/criteria@en
INFO	lowercase_definition	STATO:0000039	IAO:0000115	a statistic is a measurement datum to describe a dataset or a variable. It is generated by a calculation on set of observed data.@en
INFO	lowercase_definition	STATO:0000040	IAO:0000115	an MA plot is a scatter plot of the log intensity ratios M = log_2(T/R) versus the average log intensities A = log_2(T*R)/2, where T and R represent the signal intensities in the test and reference channels respectively.@en
INFO	lowercase_definition	STATO:0000041	IAO:0000115	an R command syntax or link to R documentation in support of Statistical Ontology Classes or Data Transformations@en
INFO	lowercase_definition	STATO:0000043	IAO:0000115	a false positive rate whose value is 5 per cent@en
INFO	lowercase_definition	STATO:0000044	IAO:0000115	one-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of only one independent variable. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.@en
INFO	lowercase_definition	STATO:0000045	IAO:0000115	two-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of exactly 2 independent variables. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.@en
INFO	lowercase_definition	STATO:0000046	IAO:0000115	a block design is a kind of study design which declares a blocking variable (also known as nuisance variable) in order to account for a known source of variation and reduce its impact on the acquisition of the signal@en
INFO	lowercase_definition	STATO:0000047	IAO:0000115	a count is a data item denoted by an integer and representing the number of instances or occurrences of an entity@en
INFO	lowercase_definition	STATO:0000050	IAO:0000115	signal to noise ratio is a measurement datum comparing the amount of meaningful, useful or interesting data (the signal) to the amount of irrelevant or false data (the noise). Depending on the field and domain of application, different variables will be used to determine a 'signal to noise ratio'. In statistics, the definition of signal to noise ratio is the ratio of the mean of a measurement to its standard deviation. It thus corresponds to the inverse of the coefficient of variation@en
INFO	lowercase_definition	STATO:0000053	IAO:0000115	a false positive rate is a data item which accounts for the proportion of incorrect rejection of a true null hypothesis.@en
INFO	lowercase_definition	STATO:0000054	IAO:0000115	homoskedasticity states that all variances under consideration are homogenous.@en
INFO	lowercase_definition	STATO:0000055	IAO:0000115	chromosome coordinate system is a genomic coordinate which uses chromosome of a particular assembly build process to define start and end positions. This coordinate system is unstable and will change with each new genome sequence assembly build.@en
INFO	lowercase_definition	STATO:0000056	IAO:0000115	a null hypothesis which states that no linkage exists between 2 categorical variables@en
INFO	lowercase_definition	STATO:0000058	IAO:0000115	goodness of fit hypothesis is a null hypothesis stating that the distribution computed from the sample population fits a theoretical distribution or that a dataset can be correctly explained by a model@en
INFO	lowercase_definition	STATO:0000059	IAO:0000115	the Student's t distribution is a continuous probability distribution which arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.@en
INFO	lowercase_definition	STATO:0000060	IAO:0000115	hypergeometric distribution is a probability distribution that describes the probability of k successes in n draws from a finite population of size N containing K successes without replacement@en
INFO	lowercase_definition	STATO:0000062	IAO:0000115	is a null hypothesis stating that there is no difference observed across a series of measurements made on the same subject.@en
INFO	lowercase_definition	STATO:0000063	IAO:0000115	genomic coordinate datum is a data item which denotes a genomic position expressed using a genomic coordinate system@en
INFO	lowercase_definition	STATO:0000064	IAO:0000115	sequence read count is a data item determining how many sequence reads have been generated by a DNA sequencing assay for a given stretch of DNA
INFO	lowercase_definition	STATO:0000067	IAO:0000115	a continuous probability distribution is a probability distribution which is defined by a probability density function@en
INFO	lowercase_definition	STATO:0000071	IAO:0000115	reaction rate is a measurement datum which represents the speed of a chemical reaction turning reactive species into product species, i.e. the number of such conversion events occurring over a time interval@en
INFO	lowercase_definition	STATO:0000072	IAO:0000115	substrate concentration is a scalar measurement datum which denotes the amount of molecular entity involved in an enzymatic reaction (or catalytic chemical reaction) and whose role in that reaction is as substrate.@en
INFO	lowercase_definition	STATO:0000075	IAO:0000115	a rarefaction curve is a graph used for estimating species richness in ecology studies@en
INFO	lowercase_definition	STATO:0000080	IAO:0000115	"the Brown Forsythe test is a statistical test which evaluates if the variances of different groups are equal. It relies on computing the median rather than the mean, as used in the Levene's test for homoscedasticity.
This test may be used to, for instance, ensure that the conditions of application of ANOVA are met.@en"
INFO	lowercase_definition	STATO:0000082	IAO:0000115	a fixed effect model is a statistical model which represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random.@en
INFO	lowercase_definition	STATO:0000084	IAO:0000115	multinomial logistic regression model is a model which attempts to explain data distribution associated with polychotomous response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is probit function.@en
INFO	lowercase_definition	STATO:0000085	IAO:0000115	effect size estimate is a data item about the direction and strength of the consequences of a causative agent as explored by statistical methods. Those methods produce estimates of the effect size, e.g. confidence interval@en
INFO	lowercase_definition	STATO:0000086	IAO:0000115	an F-test is a statistical test which evaluates that the computed test statistic follows an F-distribution under the null hypothesis. The F-test is sensitive to departure from normality. F-tests arise when decomposing the variability in a data set in terms of sum of squares.@en
INFO	lowercase_definition	STATO:0000087	IAO:0000115	a polychotomous variable is a categorical variable which is defined to have minimally 2 categories or possible values@en
INFO	lowercase_definition	STATO:0000088	IAO:0000115	statistical sample size is a count evaluating the number of individual experimental units@en
INFO	lowercase_definition	STATO:0000089	IAO:0000115	a case-control study design is an observation study design which assesses the risk of a particular outcome (a trait or a disease) associated with an event (either an exposure or endogenous factor). A case-control study design therefore declares an exposure variable which is dichotomous in nature (exposed/non-exposed) and an outcome variable, which is also dichotomous (case or control), thus giving the name to the design. During the execution of the design, a case control study defines a population and counts the events to determine their frequency.@en
INFO	lowercase_definition	STATO:0000090	IAO:0000115	a dichotomous variable is a categorical variable which is defined to have only 2 categories or possible values@en
INFO	lowercase_definition	STATO:0000095	IAO:0000115	paired t-test is a statistical test which is specifically designed to analyse differences between paired observations in the case of studies realizing repeated measures design with only 2 repeated measurements per subject (before and after treatment for example)@en
INFO	lowercase_definition	STATO:0000096	IAO:0000115	stratification is a planned process which executes a stratification rule using as input a population and assigns its members to mutually exclusive subpopulations based on the values defined by the stratification rule@en
INFO	lowercase_definition	STATO:0000099	IAO:0000115	a random effect(s) model, also called a variance components model, is a kind of hierarchical linear model. It assumes that the dataset being analysed consists of a hierarchy of different populations whose differences relate to that hierarchy.@en
INFO	lowercase_definition	STATO:0000100	IAO:0000115	"standardized mean difference is a statistic computed by forming the difference between two means, divided by an estimate of the within-group standard deviation.
It is used to provide an estimation of the effect size between two treatments when the predictor (independent variable) is categorical and the response (dependent) variable is continuous.

A standardized mean difference is a statistic that is a difference between two means, divided by a statistical measure of dispersion.

The term Standardized Mean Difference is a description of the concept without an explicit type of statistical measure of dispersion. If the statistical measure of dispersion is specified, then a type (child term) of Standardized Mean Difference is preferred.@en"
INFO	lowercase_definition	STATO:0000101	IAO:0000115	the relationship between a fraction and the number above the line@en
INFO	lowercase_definition	STATO:0000102	IAO:0000115	relationship between a planned process and the plan specification that it carries out; it is defined as equivalent to the composed relationship (realizes o concretizes)@en
INFO	lowercase_definition	STATO:0000103	IAO:0000115	the multinomial distribution is a probability distribution which gives the probability of any particular combination of numbers of successes for various categories defined in the context of n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability.@en
INFO	lowercase_definition	STATO:0000105	IAO:0000115	log signal intensity ratio is a data item which corresponds to the base 2 logarithm of the ratio between 2 signal intensities, each corresponding to a condition.@en
INFO	lowercase_definition	STATO:0000106	IAO:0000115	probit regression model is a model which attempts to explain data distribution associated with dichotomous response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is the probit function aka the quantile function, i.e., the inverse cumulative distribution function (CDF), associated with the standard normal distribution.@en
INFO	lowercase_definition	STATO:0000107	IAO:0000115	a statistical model is an information content entity which is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more other variables. The model is statistical as the variables are not deterministically but stochastically related.@en
INFO	lowercase_definition	STATO:0000108	IAO:0000115	"linear regression model is a model which attempts to explain data distribution associated with response/dependent variable in terms of values assumed by the independent variable uses a linear function or linear combination of the regression parameters and the predictor/independent variable(s).
linear regression modeling makes a number of assumptions, which include homoskedasticity (constancy of variance)@en"
INFO	lowercase_definition	STATO:0000109	IAO:0000115	multinomial logistic regression model is a model which attempts to explain data distribution associated with polychotomous response/dependent variable in terms of values assumed by the independent variable uses a function of predictor/independent variable(s): the function used in this instance of regression modeling is logistic function.@en
INFO	lowercase_definition	STATO:0000111	IAO:0000115	a sequence read is a DNA sequence data which is generated by a DNA sequencer@en
INFO	lowercase_definition	STATO:0000112	IAO:0000115	"a Funnel plot is a scatter plot of treatment effect versus a measure of study size and aims to provide a visual aid to detecting bias or systematic heterogeneity. A symmetric inverted funnel shape arises from a ‘well-behaved’ data set, in which publication bias is unlikely. An asymmetric funnel indicates a relationship between treatment effect and study size.
Known caveats: If high precision studies really are different from low precision studies with respect to effect size (e.g., due to different populations examined) a funnel plot may give a wrong impression of publication bias. The appearance of the funnel plot can change quite dramatically depending on the scale on the y-axis — whether it is the inverse square error or the trial size.
Funnel plot was introduced by Light and Pillemer in 1984.@en"
INFO	lowercase_definition	STATO:0000113	IAO:0000115	variance is a data item about a random variable or probability distribution. It is equivalent to the square of the standard deviation. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). The variance is the second central moment of a distribution.@en
INFO	lowercase_definition	STATO:0000114	IAO:0000115	relationship between an element and a set it belongs to@en
INFO	lowercase_definition	STATO:0000115	IAO:0000115	relationship between a set and one of its elements@en
INFO	lowercase_definition	STATO:0000116	IAO:0000115	"the process of using statistical analysis for interpreting and communicating \""what the data say\"".@en"
INFO	lowercase_definition	STATO:0000117	IAO:0000115	a discrete probability distribution is a probability distribution which is defined by a probability mass function where the random variable can only assume a finite number of values or infinitely countable values@en
INFO	lowercase_definition	STATO:0000118	IAO:0000115	ranking is a data transformation which turns a non-ordinal variable into an ordinal variable by sorting the values of the input variable and replacing their value by their position in the sorting result@en
INFO	lowercase_definition	STATO:0000119	IAO:0000115	model parameter estimation is a data transformation that finds parameter values (the model parameter estimates) most compatible with the data as judged by the model.@en
INFO	lowercase_definition	STATO:0000120	IAO:0000115	"beanplot is a plot in which (one or) multiple batches (\""beans\"") are shown. Each bean consists of a density trace, which is mirrored to
form a polygon shape. Next to that, a one-dimensional scatter plot shows all the individual measurements, like in a stripchart.

The name beanplot stems from green beans. The density shape can be seen as the pod of a green bean, while the scatter plot shows the seeds inside the pod.@en"
INFO	lowercase_definition	STATO:0000121	IAO:0000115	the objective of a data transformation to evaluate a null hypothesis of absence of linkage between variables.@en
INFO	lowercase_definition	STATO:0000122	IAO:0000115	a pedigree chart is a graph which plots parent child relations@en
INFO	lowercase_definition	STATO:0000123	IAO:0000115	r2 is a correlation coefficient which is computed over the frequency of 2 dichotomous variables and is used as a measure of Linkage Disequilibrium and as input data item to the creation of an LD plot@en
INFO	lowercase_definition	STATO:0000124	IAO:0000115	a stratification rule/criteria is a criterion used to determine population strata so that a stratification process implementing the rule can result in any member of the total population being assigned to one and only one stratum@en
INFO	lowercase_definition	STATO:0000126	IAO:0000115	"volcano plot is a kind of scatter plot which graphs the negative log of the p-value (significance) on the y-axis versus log2 of fold-change between 2 conditions on the x-axis.
It is a popular method for visualizing differential occurrence of variables between 2 conditions.@en"
INFO	lowercase_definition	STATO:0000127	IAO:0000115	a confidence interval which covers 99% of the sampling distribution, meaning that there is a 1% risk of false positive (type I error)@en
INFO	lowercase_definition	STATO:0000130	IAO:0000115	the Breslow-Day test is a statistical test which evaluates if the odds ratios are homogeneous across N 2x2 contingency tables, for instance several 2x2 contingency tables associated with different strata of a stratified population when evaluating the relationship between exposure and outcome or associated with the different samples coming from several centres in a multicentric study in a clinical trial context.@en
INFO	lowercase_definition	STATO:0000131	IAO:0000115	a sphericity test is a null hypothesis statistical testing procedure which posits a null hypothesis of equality of the variances of the differences between levels of the repeated measures factor@en
INFO	lowercase_definition	STATO:0000134	IAO:0000115	specificity is a measurement datum qualifying a binary classification test and is computed by subtracting the false positive rate from the integer 1@en
INFO	lowercase_definition	STATO:0000135	IAO:0000115	"strictly standardized mean difference (SSMD) is a standardized mean difference which corresponds to the ratio of mean to the standard deviation of the difference between two groups.
SSMD directly measures the magnitude of difference between two groups.
SSMD is widely used in High Content Screen for hit selection and quality control.

When the data is preprocessed using log-transformation as normally done in HTS experiments, SSMD is the mean of log fold change divided by the standard deviation of log fold change with respect to a negative reference.

In other words, SSMD is the average fold change (on the log scale) penalized by the variability of fold change (on the log scale).

For quality control, one index for the quality of an HTS assay is the magnitude of difference between a positive control and a negative reference in an assay plate. For hit selection, the size of effects of a compound (i.e., a small molecule or an siRNA) is represented by the magnitude of difference between the compound and a negative reference. SSMD directly measures the magnitude of difference between two groups. Therefore, SSMD can be used for both quality control and hit selection in HTS experiments.@en"
INFO	lowercase_definition	STATO:0000137	IAO:0000115	a homoskedasticity test is a statistical test which aims to evaluate whether the variances from several random samples are similar@en
INFO	lowercase_definition	STATO:0000138	IAO:0000115	a 2x2 contingency table is a contingency table built for 2 dichotomous variables (i.e. 2 categorical variables, each with only 2 possible outcomes). It is the simplest of contingency tables@en
INFO	lowercase_definition	STATO:0000139	IAO:0000115	a subject pairing is a planned process which executes a pairing rule and results in the creation of sets of 2 subjects meeting the pairing criteria@en
INFO	lowercase_definition	STATO:0000140	IAO:0000115	"a contingency table is a data item which displays the (multivariate) frequency distribution of the possible values of categorical variables.
The first row of the table corresponds to categories of one categorical variable, the first column of the table corresponds to categories of the other categorical variable, and the cells corresponding to each combination of categories are filled with the observed occurrences in the sample being considered.
The table also contains marginal totals (marginal sums) and the grand total of the occurrences.

The term contingency table was first used by Karl Pearson in \""On the Theory of Contingency and Its Relation to Association and Normal Correlation\"", part of the Drapers' Company Research Memoirs Biometric Series I published in 1904.@en"
INFO	lowercase_definition	STATO:0000141	IAO:0000115	acute toxicity study is an investigation which uses interventions organized according to a factorial design and a parallel group design to observe the effect of use of high dose xenobiotics in animal models or cellular models@en
INFO	lowercase_definition	STATO:0000144	IAO:0000115	a model parameter estimate is a data item which results from a model parameter estimation process and which provides a numerical value about a model parameter.@en
INFO	lowercase_definition	STATO:0000145	IAO:0000115	"the geometric distribution is a negative binomial distribution where r is 1.
It is useful for modeling the runs of consecutive successes (or failures) in repeated independent trials of a system.

The geometric distribution models the number of successes before one failure in an independent succession of tests where each test results in success or failure.


The geometric distribution with prob = p has density

p(x) = p (1-p)^x

for x = 0, 1, 2, …, 0 < p ≤ 1.

If an element of x is not integer, the result of dgeom is zero, with a warning.

The quantile is defined as the smallest value x such that F(x) ≥ p, where F is the distribution function.@en"
INFO	lowercase_definition	STATO:0000146	IAO:0000115	a null hypothesis stating that there are differences observed between groups of subjects@en
INFO	lowercase_definition	STATO:0000149	IAO:0000115	binomial logistic regression model is a model which attempts to explain data distribution associated with dichotomous response/dependent variable in terms of values assumed by the independent variable, using a function of predictor/independent variable(s): the function used in this instance of regression modeling is the logistic function.@en
INFO	lowercase_definition	STATO:0000150	IAO:0000115	a minimum value is a data item which denotes the smallest value found in a dataset or resulting from a calculation.@en
INFO	lowercase_definition	STATO:0000151	IAO:0000115	maximum value is a data item which denotes the largest value found in a dataset or resulting from a calculation.@en
INFO	lowercase_definition	STATO:0000152	IAO:0000115	a quartile is a quantile which splits data into sections accrued of 25% of data, so the first quartile delineates 25% of the data, the second quartile delineates 50% of the data and the third quartile, 75 % of the data@en
INFO	lowercase_definition	STATO:0000154	IAO:0000115	"a violin plot is a plot combining the features of box plot and kernel density plot. The violin plot is therefore similar to box plot but it also incorporates in the display the probability density of the data at different values.
Typically violin plots will include a marker for the median of the data and a box indicating the interquartile range, as in standard box plots.@en"
INFO	lowercase_definition	STATO:0000155	IAO:0000115	meta-analysis is a data transformation which uses the effect size estimates from several independent quantitative scientific studies addressing the same question in order to assess finding consistency.@en
INFO	lowercase_definition	STATO:0000156	IAO:0000115	the Scheffe test is a data transformation which evaluates all possible contrasts and adjusts the significance levels by accounting for multiple comparisons. The test is therefore conservative. Confidence intervals can be constructed for the corresponding linear regression. It was developed by American statistician Henry Scheffe in 1959.@en
INFO	lowercase_definition	STATO:0000157	IAO:0000115	the LSD test is a statistical test for multiple comparisons of treatments by means of least significant difference following an ANOVA analysis
INFO	lowercase_definition	STATO:0000158	IAO:0000115	a null hypothesis which states that a linkage exists between 2 categorical variables@en
INFO	lowercase_definition	STATO:0000161	IAO:0000115	variable distribution is a data item which denotes the spatial resolution of data points making up a variable. variable distribution may be compared to a known probability distribution using a goodness of fit test or by plotting a quantile-quantile plot for visual assessment of the fit.@en
INFO	lowercase_definition	STATO:0000162	IAO:0000115	the role played by an entity part of study group as defined by an experimental design and realized in a data analysis and data interpretation@en
INFO	lowercase_definition	STATO:0000163	IAO:0000115	trimmed mean or truncated mean is a measure of central tendency which involves the calculation of the mean after discarding given parts of a probability distribution or sample at the high and low end, and typically discarding an equal amount of both@en
INFO	lowercase_definition	STATO:0000165	IAO:0000115	a pie chart is a graph in which a circular graph is divided into sectors illustrating numerical proportion, meaning that the arc length of each sector (and consequently its central angle and area), is proportional to the quantity it represents.@en
INFO	lowercase_definition	STATO:0000166	IAO:0000115	the bar chart is a graph resulting from plotting rectangular bars with lengths proportional to the values that they represent.
INFO	lowercase_definition	STATO:0000167	IAO:0000115	the first quartile is a quartile which splits the lower 25 % of the data@en
INFO	lowercase_definition	STATO:0000168	IAO:0000115	a real time quantitative pcr plot is a line graph which plots the signal fluorescence intensity as a function of the number of PCR cycle@en
INFO	lowercase_definition	STATO:0000170	IAO:0000115	the third quartile is a quartile which splits the lower 75 % of the data@en
INFO	lowercase_definition	STATO:0000173	IAO:0000115	"homogeneity testing objective is the objective of a data transformation to test a null hypothesis that two or more sub-groups of a population share the same distribution of a single categorical variable.
For example, do people of different countries have the same proportion of smokers to non-smokers@en"
INFO	lowercase_definition	STATO:0000175	IAO:0000115	confidence interval calculation is a data transformation which determines a confidence interval for a given statistical parameter@en
INFO	lowercase_definition	STATO:0000176	IAO:0000115	t-statistic is a statistic computed from observations and used to produce a p-value in statistical test when compared to a Student's t distribution.@en
INFO	lowercase_definition	STATO:0000177	IAO:0000115	the beta distribution is a continuous probability distribution defined on the interval [0, 1] parametrized by two positive shape parameters, denoted by α and β, that appear as exponents of the random variable and control the shape of the distribution@en
INFO	lowercase_definition	STATO:0000180	IAO:0000115	standard normal distribution is a normal distribution with variance = 1 and mean=0@en
INFO	lowercase_definition	STATO:0000183	IAO:0000115	sphericity testing objective is a statistical objective of a data transformation which aims to test whether a null hypothesis of sphericity holds.@en
INFO	lowercase_definition	STATO:0000185	IAO:0000115	a 2 by n contingency table is a contingency table built for one dichotomous variable (a categorical variable with only 2 outcomes) and one polychotomous variable (a categorical variable with at least 2 outcomes)@en
INFO	lowercase_definition	STATO:0000188	IAO:0000115	average log signal intensity is a data item which corresponds to the sum of 2 distinct logarithm base 2 transformed signal intensities, each corresponding to a distinct condition of signal acquisition, divided by 2.@en
INFO	lowercase_definition	STATO:0000191	IAO:0000115	a goodness of fit statistical test is a statistical test which aims to evaluate whether a sample distribution can be considered equivalent to a theoretical distribution used as input@en
INFO	lowercase_definition	STATO:0000192	IAO:0000115	a cartesian product is a data transformation which operates on n sets to produce the set of all possible ordered n-tuples where each element of the tuple comes from one of the sets
INFO	lowercase_definition	STATO:0000193	IAO:0000115	is a population whose individual members realize (may be expressed as) a combination of inclusion rule values specifications or resulting from a sampling process (e.g. recruitment followed by randomization to group) on which a number of measurements will be carried out, which may be used as input to statistical tests and statistical inference.
INFO	lowercase_definition	STATO:0000194	IAO:0000115	self explanatory@en
INFO	lowercase_definition	STATO:0000197	IAO:0000115	a genomic coordinate system is a coordinate system to describe position of sequence on a genomic scaffold (assembly of chromosome, contig....)@en
INFO	lowercase_definition	STATO:0000198	IAO:0000115	a statistical test which makes no assumption about the underlying data distribution@en
INFO	lowercase_definition	STATO:0000199	IAO:0000115	"the Mauchly's test for sphericity is a statistical test which evaluates if the variance of the differences between all combinations of the groups are equal, a property known as 'sphericity' in the context of repeated measures. It is used for instance prior to repeated measure ANOVA.
The test works by assessing if a Wishart-distributed covariance matrix (or transformation thereof) is proportional to a given matrix.@en"
INFO	lowercase_definition	STATO:0000200	IAO:0000115	the statistical test power is a data item which is about a statistical test and is obtained by subtracting the false negative rate (type II error rate) from 1. The power of a statistical test is the probability that it will correctly lead to the rejection of a false null hypothesis (Greene 2000). The statistical power is the ability of a test to detect an effect, if the effect actually exists (High 2000).@en
INFO	lowercase_definition	STATO:0000202	IAO:0000115	within subject comparison statistical test is a kind of statistical test which evaluates if a change occurs within one experimental unit over time following a treatment or an event@en
INFO	lowercase_definition	STATO:0000203	IAO:0000115	a cohort is a study group population where the members are human beings which meet inclusion criteria and undergo a longitudinal design@en
INFO	lowercase_definition	STATO:0000204	IAO:0000115	the F-distribution is a continuous probability distribution which arises in the testing of whether two observed samples have the same variance.@en
INFO	lowercase_definition	STATO:0000207	IAO:0000115	a planned process which establishes and states the different hypotheses to be evaluated during a null hypothesis statistical test@en
INFO	lowercase_definition	STATO:0000209	IAO:0000115	area under curve is a measurement datum which corresponds to the surface defined by the x-axis and bounded by the line graph represented in a 2 dimensional plot resulting from an integration or integrative calculus. The interpretation of this measurement datum depends on the variables plotted in the graph@en
INFO	lowercase_definition	STATO:0000210	IAO:0000115	is a data item formed by dividing the fluorescence intensity obtained in one channel to that obtained in the other channel, typically the case when considering 2-color microarray data when imaging is done for Cy3 and Cy5 dyes.@en
INFO	lowercase_definition	STATO:0000211	IAO:0000115	odds ratio homogeneity hypothesis is a null hypothesis stating that all odds ratios are homogeneous, that is, remain within the same range.@en
INFO	lowercase_definition	STATO:0000212	IAO:0000115	a tetrachoric correlation coefficient is a polychoric correlation coefficient for 2 dichotomous variables used as proxy for correlation between 2 continuous latent variables.@en
INFO	lowercase_definition	STATO:0000213	IAO:0000115	discretization is a process converting a continuous variable into a polychotomous variable by concretizing a set of discretization rules@en
INFO	lowercase_definition	STATO:0000214	IAO:0000115	a confidence interval which covers 50% of the sampling distribution, meaning that there is a 50% risk of false positive (type I error)@en
INFO	lowercase_definition	STATO:0000215	IAO:0000115	probit regression model is a model which attempts to explain data distribution associated with ordinal response/dependent variable in terms of values assumed by the independent variable, using a function of predictor/independent variable(s): the function used in this instance of regression modeling is the ordered probit function.@en
INFO	lowercase_definition	STATO:0000216	IAO:0000115	a stratum population is a population resulting from a population stratification prior to sampling process which aims to produce homogeneous subpopulations from a heterogeneous population by applying one or more stratification criteria@en
INFO	lowercase_definition	STATO:0000217	IAO:0000115	a null hypothesis which states that a given matrix is proportional to a Wishart-distributed covariance matrix@en
INFO	lowercase_definition	STATO:0000219	IAO:0000115	a real time pcr standard curve is a line graph which plots the fluorescence intensity signal as a function of the concentration of a sample used as reference and used to determine relative abundance of test samples@en
INFO	lowercase_definition	STATO:0000220	IAO:0000115	the false negative rate is a data item which denotes the proportion of missed detection of elements known to be meeting the detection criteria@en
INFO	lowercase_definition	STATO:0000221	IAO:0000115	a random variable (or aleatory variable or stochastic variable) in probability and statistics, is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense)@en
INFO	lowercase_definition	STATO:0000222	IAO:0000115	graeco-latin square design is a study design which allows in its simpler form controlling 3 levels of nuisance variables (also known as blocking variables). The 3 nuisance factors are divided into a tabular grid with the property that each row and each column receive each treatment exactly once.@en
INFO	lowercase_definition	STATO:0000223	IAO:0000115	group assignment based on blocking variable specification is a kind of group assignment process which takes into account the levels assumed by a blocking variable to allocate subjects or experimental units to a treatment group@en
INFO	lowercase_definition	STATO:0000227	IAO:0000115	"a normal distribution is a continuous probability distribution described by a probability distribution function described here:
http://mathworld.wolfram.com/NormalDistribution.html@en"
INFO	lowercase_definition	STATO:0000228	IAO:0000115	ordinal variable is a categorical variable where the discrete possible values are ordered or correspond to an implicit ranking@en
INFO	lowercase_definition	STATO:0000230	IAO:0000115	the expected value (or expectation, mathematical expectation, EV, mean, or the first moment) of a random variable is a data item which corresponds to the weighted average of all possible values that this random variable can take on. The weights used in computing this average correspond to the probabilities in case of a discrete random variable, or densities in case of a continuous random variable. From a rigorous theoretical standpoint, the expected value is the integral of the random variable with respect to its probability measure.@en
INFO	lowercase_definition	STATO:0000231	IAO:0000115	a confidence interval which covers 95% of the sampling distribution, meaning that there is a 5% risk of false positive (type I error). If the number of observations made is large enough, the sampling distribution can be assumed to be normal, which entails that 95% of the sampling distribution falls within roughly 2 (1.96) standard deviations from the mean.@en
INFO	lowercase_definition	STATO:0000232	IAO:0000115	number of PCR cycle is a count which enumerates how many iterations of 'annealing, renaturation, amplification,' rounds (or cycles) are performed during a polymerase chain reaction (PCR) or an assay relying on PCR.@en
INFO	lowercase_definition	STATO:0000233	IAO:0000115	sensitivity is a measurement datum qualifying a binary classification test and is computed by subtracting the false negative rate from the integer 1@en
INFO	lowercase_definition	STATO:0000234	IAO:0000115	a residual is a data item which is the output of an error estimate or model fitting process and which is an observable estimate of the unobservable error@en
INFO	lowercase_definition	STATO:0000236	IAO:0000115	the coefficient of variation is a normalized measure of dispersion of a probability distribution or frequency distribution.@en
INFO	lowercase_definition	STATO:0000238	IAO:0000115	high content screening is a kind of investigation which uses standardized cellular assays to test the effect of substances (RNAi or small molecules) held in libraries on a cellular phenotype. It relies on microscopy imaging and/or flow cytometry, and on robotic handling, to ensure speed and high throughput.@en
INFO	lowercase_definition	STATO:0000239	IAO:0000115	high throughput screening is a kind of investigation which uses standardized assays (cell based, enzymatic or chemometric) to test the effect of substances (RNAi or small molecules) held in libraries on a very specific and measurable outcome (e.g. fluorescence intensity). It relies on robotic handling to ensure speed and high throughput in assay performance, data acquisition and hit selection.@en
INFO	lowercase_definition	STATO:0000242	IAO:0000115	statistical error is a data item denoting the amount by which an observation differs from the expected value, being based on the whole statistical population from which the statistical unit was chosen randomly@en
INFO	lowercase_definition	STATO:0000243	IAO:0000115	a box plot is a graph which plots datasets relying on their quartiles and the interquartile range to create the box and the whiskers.@en
INFO	lowercase_definition	STATO:0000244	IAO:0000115	(Rn +) − (Rn −), where Rn + = (emission intensity of reporter dye)/(emission intensity of passive reference dye) in PCR with template and Rn − = (emission intensity of reporter dye)/(emission intensity of passive reference dye) in PCR without template or early cycles of a real-time reaction. Ct = threshold cycle, i.e., cycle at which a statistically significant increase in ΔRn is first detected@en
INFO	lowercase_definition	STATO:0000247	IAO:0000115	odds ratio homogeneity test is a statistical test which aims to evaluate whether the null hypothesis of consistency of odds ratios across different strata of a population is true or not@en
INFO	lowercase_definition	STATO:0000248	IAO:0000115	a blocking variable is an independent variable which is used in a blocking process that is part of an experiment, with the purpose of maximizing the signal coming from the main variable.
INFO	lowercase_definition	STATO:0000249	IAO:0000115	a DNA microarray hybridization is an assay relying on nucleic acid hybridization, which uses a DNA microarray device and a nucleic acid as input. It precedes a data acquisition process@en
INFO	lowercase_definition	STATO:0000250	IAO:0000115	group comparison objective is a data transformation objective which aims to determine if 2 or more study groups differ with respect to the signal of a response variable@en
INFO	lowercase_definition	STATO:0000252	IAO:0000115	a categorical variable is a variable which can only assume a finite number of values and casts observations into a small number of categories@en
INFO	lowercase_definition	STATO:0000253	IAO:0000115	the objective of a data transformation to test whether a null hypothesis of absence of difference within subject holds.@en
INFO	lowercase_definition	STATO:0000255	IAO:0000115	the objective of a data transformation to test whether a null hypothesis of absence of difference within subject holds.@en
INFO	lowercase_definition	STATO:0000256	IAO:0000115	a manhattan plot for gwas is a kind of scatter plot used to facilitate presentation of genome-wide association study (GWAS) data. Genomic coordinates are displayed along the X-axis, with the negative logarithm of the association P-value for each single nucleotide polymorphism displayed on the Y-axis.@en
INFO	lowercase_definition	STATO:0000258	IAO:0000115	a variable is a data item which can assume any of a set of values, either as determined by an agent or as randomly occurring through observation.@en
INFO	lowercase_definition	STATO:0000259	IAO:0000115	the relationship between a fraction and the number below the line (or divisor)@en
INFO	lowercase_definition	STATO:0000260	IAO:0000115	"repeated measure ANOVA is a kind of ANOVA specifically developed for non-independent observations as found when repeated measurements are made on the same experimental unit.
repeated measure ANOVA is sensitive to departure from normality (evaluation using Bartlett's test), more so in the case of unbalanced groups (i.e. different sizes of sample populations).
Departure from sphericity (evaluation using Mauchly's test) used to be an issue which is now handled robustly by modern tools such as R's lme4 or nlme, which accommodate dependence assumptions other than sphericity.@en"
INFO	lowercase_definition	STATO:0000264	IAO:0000115	a factor level combination is one of the possible sets of factor levels resulting from the cartesian product of sets of factors and their levels as defined in a factorial design@en
INFO	lowercase_definition	STATO:0000267	IAO:0000115	grouped bar chart is a kind of bar chart which juxtaposes the discrete values for each of the possible value of a given categorical variable, thus providing within group comparison. Grouped bar charts are good for comparing between each element in the categories, and comparing elements across categories. However, the grouping can make it harder to tell the difference between the total of each group.@en
INFO	lowercase_definition	STATO:0000269	IAO:0000115	polychoric correlation coefficient is a correlation coefficient which is computed over 2 variables to characterise an association by proxy with 2 (latent) variables which are assumed to be continuous and normally distributed.@en
INFO	lowercase_definition	STATO:0000270	IAO:0000115	a full factorial design is a factorial design which ensures that all possible factor level combinations are defined and used so all between group differences can be explored@en
INFO	lowercase_definition	STATO:0000271	IAO:0000115	permutation numbering is a data transformation which counts the number of possible permutations of elements in a set of size n, each element occurring exactly once. This number is factorial n.@en
INFO	lowercase_definition	STATO:0000274	IAO:0000115	receiver operational characteristics curve is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold (aka cut-off point) is varied by plotting sensitivity vs (1 − specificity)@en
INFO	lowercase_definition	STATO:0000277	IAO:0000115	hit selection is a planned process which, in screening processes such as high-throughput screening, leads to the identification of perturbing agents which cause the typical signal generated by a standardized assay to significantly differ from the negative control. The selection itself results from meeting or exceeding a selection threshold (for instance 6 sigma from the mean or SSMD value beyond 5 when compared to positive controls or below -5 when compared to negative controls)@en
INFO	lowercase_definition	STATO:0000278	IAO:0000115	pairing rule is a rule which specifies the criteria for deciding on how to associate any 2 entities.@en
INFO	lowercase_definition	STATO:0000279	IAO:0000115	between group comparison statistical test is a statistical test which aims to detect difference between the means computed for each of the study group populations@en
INFO	lowercase_definition	STATO:0000281	IAO:0000115	a false positive rate whose value is 1 per cent@en
INFO	lowercase_definition	STATO:0000283	IAO:0000115	negative binomial probability distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified (non-random) number of failures (denoted r) occur. The negative binomial distribution, also known as the Pascal distribution or Pólya distribution, gives the probability of r-1 successes and x failures in x+r-1 trials, and success on the (x+r)th trial.@en
INFO	lowercase_definition	STATO:0000285	IAO:0000115	hypergeometric test is a null hypothesis test which evaluates if a random variable follows a hypergeometric distribution. It is a test of goodness of fit to that distribution. The test is suited for situations aimed at assessing cases of sampling from a finite set without replacement. For instance, testing for enrichment or depletion of elements (e.g. GO categories, genes)@en
INFO	lowercase_definition	STATO:0000286	IAO:0000115	"a one-tailed test is a statistical test which, assuming an unskewed probability distribution, allocates all of the significance level to evaluate only one hypothesis to explain a difference.
The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction.
one-tailed test should be preceded by two-tailed test in order to avoid missing out on detecting alternate effect explaining an observed difference.@en"
INFO	lowercase_definition	STATO:0000287	IAO:0000115	a two tailed test is a statistical test which assesses the null hypothesis of absence of difference assuming a symmetric (not skewed) underlying probability distribution by allocating half of the significance level selected to each of the directions of change which could explain a difference (for example, a difference can be an excess or a loss).@en
INFO	lowercase_definition	STATO:0000289	IAO:0000115	"a design matrix is an information content entity which denotes a study design. The design matrix is an n by m matrix where n, the number of rows, corresponds to the number of observations (4 rows if quadruplicates) and where m, the number of columns, corresponds to the number of independent variables. Each element in the matrix corresponds to a discretized value representing one of the factor levels for a given factor.
A design matrix can be used as input to statistical modeling or statistical analysis.

The design matrix contains data on the independent variables (also called explanatory variables) in statistical models which attempt to explain observed data on a response variable (often called a dependent variable) in terms of the explanatory variables. The theory relating to such models makes substantial use of matrix manipulations involving the design matrix: see for example linear regression. A notable feature of the concept of a design matrix is that it is able to represent a number of different experimental designs and statistical models, e.g., ANOVA, ANCOVA, and linear regression@en"
INFO	lowercase_definition	STATO:0000291	IAO:0000115	"a quantile is a data item which corresponds to specific elements x in the range of a variate X.
the k-th n-tile P_k is that value of x, say x_k, which corresponds to a cumulative frequency of Nk/n (Kenney and Keeping 1962). If n=4, the quantity is called a quartile, and if n=100, it is called a percentile.@en"
INFO	lowercase_definition	STATO:0000292	IAO:0000115	a decile is a quantile where n=10 and which splits data into sections accrued of 10% of data, so the first decile delineates 10% of the data, the second decile delineates 20% of the data and the ninth decile, 90 % of the data@en
INFO	lowercase_definition	STATO:0000293	IAO:0000115	a percentile is a quantile which splits data into sections accrued of 1% of data, so the first percentile delineates 1% of the data, the second percentile delineates 2% of the data and the 99th percentile, 99 % of the data@en
INFO	lowercase_definition	STATO:0000294	IAO:0000115	absence of negative difference hypothesis is a hypothesis which assumes that a difference significantly less than a threshold does not exist.@en
INFO	lowercase_definition	STATO:0000295	IAO:0000115	absence of positive difference hypothesis is a hypothesis which assumes that a difference significantly greater than a threshold does not exist.@en
INFO	lowercase_definition	STATO:0000296	IAO:0000115	absence of depletion difference hypothesis is a hypothesis which assumes that the representation of an element significantly greater than a threshold does not exist.@en
INFO	lowercase_definition	STATO:0000297	IAO:0000115	absence of depletion difference hypothesis is a hypothesis which assumes that the representation of an element significantly less than a threshold does not exist.@en
INFO	lowercase_definition	STATO:0000298	IAO:0000115	a binomial test is a statistical hypothesis test which evaluates the observations made in a Bernoulli experiment, that is, it tests the statistical significance of deviations from a theoretically expected distribution (the binomial distribution) of observations into 2 categories. It is a goodness of fit test.@en
INFO	lowercase_definition	STATO:0000302	IAO:0000115	"one sample t-test is a kind of Student's t-test which evaluates if a given sample can be reasonably assumed to be taken from the population.
The test compares the sample statistic (m) to the population parameter (M).

The one sample t-test is the small sample analog of the z test, which is suitable for large samples.@en"
INFO	lowercase_definition	STATO:0000303	IAO:0000115	"two sample t-test is a null hypothesis statistical test which is used to reject or accept the hypothesis of absence of difference between the means over 2 randomly sampled populations.
It uses a t-distribution for the test and assumes that the variables in the population are normally distributed and with equal variances.@en"
INFO	lowercase_definition	STATO:0000306	IAO:0000115	a polynomial contrast is a contrast which...@en
INFO	lowercase_definition	STATO:0000307	IAO:0000115	treatment contrast is a contrast in which the “first” level (aka the baseline) is included in the intercept and all subsequent levels have a coefficient that represents their difference from the baseline.@en
INFO	lowercase_definition	STATO:0000308	IAO:0000115	the sum contrast is a contrast in which each coefficient compares the corresponding level of the factor to the average of the other levels@en
INFO	lowercase_definition	STATO:0000311	IAO:0000115	"a central composite design is a study design which contains an imbedded factorial or fractional factorial design with center points that is augmented with a group of so-called 'star points' that allow estimation of curvature.
A CCD design with k factors has 2k star points.@en"
INFO	lowercase_definition	STATO:0000314	IAO:0000115	upper confidence limit is a data item which is the largest value bounding a confidence interval@en
INFO	lowercase_definition	STATO:0000315	IAO:0000115	lower confidence limit is a data item which is the lowest value bounding a confidence interval@en
INFO	lowercase_definition	STATO:0000316	IAO:0000115	root-mean-square standardized effect is a statistic which denotes effect size in the context of analysis of variance and corresponds to the square root of the arithmetic average of p standardized effects (effects normalized to be expressed in standard deviation units).@en
INFO	lowercase_definition	STATO:0000318	IAO:0000115	omega-squared is an effect size estimate for variance explained which is less biased than the eta-squared coefficient.@en
INFO	lowercase_definition	STATO:0000322	IAO:0000115	a contrast weight is a coefficient which multiplies a group mean, part of a linear combination defining a contrast as a weighted sum of group means, giving a 'weight' to a specific group mean, hence the name.@en
INFO	lowercase_definition	STATO:0000323	IAO:0000115	a contrast weight matrix is an information content entity which holds a set of contrast weights, the coefficients used in a weighted sum of means defining a contrast@en
INFO	lowercase_definition	STATO:0000324	IAO:0000115	contrast weight estimate is a model parameter estimate which results from the computation from the data and that is used as input to a model fitting process@en
INFO	lowercase_definition	STATO:0000326	IAO:0000115	corrected Akaike information criterion is a modified version of the Akaike information criterion.@en
INFO	lowercase_definition	STATO:0000333	IAO:0000115	kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable@en
INFO	lowercase_definition	STATO:0000336	IAO:0000115	best linear unbiased prediction is a data transformation which predicts under the assumption that the variable(s) under consideration have a random effect
INFO	lowercase_definition	STATO:0000337	IAO:0000115	breeding value estimation is a data transformation process aiming at computing breeding value estimates of an organism given a set of genomic (SNP) observations, pedigree information and/or phenotypic observations.@en
INFO	lowercase_definition	STATO:0000338	IAO:0000115	breeding value estimation is a data transformation process aiming at computing breeding value estimates of an organism given a set of genomic (SNP) observations.
INFO	lowercase_definition	STATO:0000339	IAO:0000115	breeding value estimation is a data transformation process aiming at computing breeding value estimates of an organism given a set of pedigree information.
INFO	lowercase_definition	STATO:0000340	IAO:0000115	breeding value estimation is a data transformation process aiming at computing breeding value estimates of an organism given a set of phenotypic observations.
INFO	lowercase_definition	STATO:0000342	IAO:0000115	genomic selection objective is a data transformation objective which is a special case of marker-assisted selection in which genetic markers covering the whole genome are used so that all quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker.@en
INFO	lowercase_definition	STATO:0000343	IAO:0000115	a dataset which is made up of genotypic information, that is presenting allele information at specific loci in a set of individuals of an organism.
INFO	lowercase_definition	STATO:0000344	IAO:0000115	'has effect on' is a special case of the 'is about' relationship to be used for mixed effect models@en
INFO	lowercase_definition	STATO:0000345	IAO:0000115	'has fixed effect on' is a special case of the 'is about' relationship to be used with fixed effect models@en
INFO	lowercase_definition	STATO:0000346	IAO:0000115	a covariance structure is a data item which is part of a regression model and which indicates a pattern in the covariance matrix. The nature of the covariance structure is specified before the regression analysis, and various covariance structures may be tested and evaluated using information criteria to help choose the most suitable model@en
INFO	lowercase_definition	STATO:0000349	IAO:0000115	spatial linear geometric anisotropic covariance structure is a type of covariance structure characterized by its anisotropy, i.e., the variation of properties can be different in directions x and y, which in this case gives linear features.@en
INFO	lowercase_definition	STATO:0000350	IAO:0000115	spatial spherical geometric anisotropic covariance structure is a type of covariance structure characterized by its anisotropy, i.e., the variation of properties can be different in directions x and y, which in this case gives spherical features.@en
INFO	lowercase_definition	STATO:0000351	IAO:0000115	spatial gaussian geometric anisotropic covariance structure is a type of covariance structure characterized by its anisotropy, i.e., the variation of properties can be different in directions x and y, which in this case gives gaussian features.@en
INFO	lowercase_definition	STATO:0000352	IAO:0000115	spatial exponential geometric anisotropic covariance structure is a type of covariance structure characterized by its anisotropy, i.e., the variation of properties can be different in directions x and y, which in this case gives exponential features.@en
INFO	lowercase_definition	STATO:0000353	IAO:0000115	spatial exponential anisotropic covariance structure is a type of covariance structure characterized by its anisotropy, i.e., the variation of properties can be different in directions x and y, which in this case gives exponential features.@en
INFO	lowercase_definition	STATO:0000354	IAO:0000115	the banded heterogeneous Toeplitz covariance structure is a type of covariance structure which is often used to analyze and interpret repeated measure designs.
INFO	lowercase_definition	STATO:0000358	IAO:0000115	a form of covariance structure used to provide analysis grounds in the context of repeated measures datasets (longitudinal, time series)@en
INFO	lowercase_definition	STATO:0000359	IAO:0000115	"factor-analytic structure is a covariance structure which is specified for q factors
equal diagonal factor-analytic covariance structure is a type of factor analytic covariance structure specified for q factors, which includes a diagonal component for repeated measures.@en"
INFO	lowercase_definition	STATO:0000360	IAO:0000115	no diagonal factor-analytic covariance structure is a type of factor analytic covariance structure specified for q factors, which does not include a diagonal component for repeated measures.@en
INFO	lowercase_definition	STATO:0000361	IAO:0000115	factor-analytic structure is a type of heterogeneous covariance structure which is specified for q factors@en
INFO	lowercase_definition	STATO:0000362	IAO:0000115	compound symmetry covariance structure is a covariance structure which means that all the variances are equal and all the covariances are equal.@en
INFO	lowercase_definition	STATO:0000363	IAO:0000115	heterogenous compound symmetry structure is a compound symmetry covariance structure which has a different variance parameter for each diagonal element, and it uses the square roots of these parameters in the off-diagonal entries.@en
INFO	lowercase_definition	STATO:0000364	IAO:0000115	first order autoregressive moving average covariance structure is a type of covariance structure which is used in the context of time series analysis@en
INFO	lowercase_definition	STATO:0000365	IAO:0000115	first order autoregressive covariance structure is a covariance structure where correlations among errors decline exponentially with distance@en
INFO	lowercase_definition	STATO:0000369	IAO:0000115	repeated measure analysis is a kind of data transformation which deals with signals measured in the same experimental units at different times and, possibly, under different conditions over a period of time. Data produced by longitudinal studies qualify for such analysis. Since measurements are made on the same experimental units a number of times, they are likely to be correlated. Repeated measure analysis usually takes into consideration the possibility of correlation with time. It does so by specifying covariance structure in the analysis@en
INFO	lowercase_definition	STATO:0000370	IAO:0000115	the ordinary least squares estimation is a model parameter estimation for a linear regression model when the errors are uncorrelated and equal in variance. It is the Best Linear Unbiased Estimation (BLUE) method under these assumptions, and the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) with the addition of a Gaussian assumption.@en
INFO	lowercase_definition	STATO:0000371	IAO:0000115	the weighted least squares estimation is a model parameter estimation for a linear regression model with errors that are independent but have heterogeneous variance. It is difficult to use in practice, as weights must be set based on the variance, which is usually unknown. If the true variance is known, it is the Best Linear Unbiased Estimation (BLUE) method under these assumptions, and the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) with the addition of a Gaussian assumption.@en
INFO	lowercase_definition	STATO:0000372	IAO:0000115	"the generalized least squares estimation is a model parameter estimation for a linear regression model with errors that are dependent and (possibly) have heterogeneous variance. It is difficult to use in practice, as the covariance matrix of the errors must be known to \""whiten\"" data and model. If the true covariance is known, it is the Best Linear Unbiased Estimation (BLUE) method under these assumptions, and the Uniformly Minimum-Variance Unbiased Estimator (UMVUE) with the addition of a Gaussian assumption.@en"
INFO	lowercase_definition	STATO:0000373	IAO:0000115	the iteratively reweighted least squares estimation is a model parameter estimation which is a practical implementation of Weighted Least Squares, where the heterogeneous variances of the errors are estimated from the residuals of the regression model, providing an estimate for the weights. Each successive estimate of the weights improves the estimation of the regression parameters, which in turn are used to compute residuals and update the weights@en
INFO	lowercase_definition	STATO:0000374	IAO:0000115	the feasible generalized least squares estimation is a model parameter estimation which is a practical implementation of Generalised Least Squares, where the covariance of the errors is estimated from the residuals of the regression model, providing the information needed to whiten the data and model. Each successive estimate of the whitening matrix improves the estimation of the regression parameters, which in turn are used to compute residuals and update the whitening matrix.
INFO	lowercase_definition	STATO:0000375	IAO:0000115	a residual mean square is a data item which is obtained by dividing the sum of squared residuals (SSR) by the number of degrees of freedom
INFO	lowercase_definition	STATO:0000380	IAO:0000115	'has interaction effect on' is a special case of the 'is about' relationship to be used for mixed effect models@en
INFO	lowercase_definition	STATO:0000381	IAO:0000115	'has random effect on' is a special case of the 'is about' relationship to be used for random effect models@en
INFO	lowercase_definition	STATO:0000382	IAO:0000115	'has order in sequence' is a special case of the 'is about' relation, used to enumerate the different terms within a linear mixed model formula (thus assigning an order to random effect terms, fixed effect terms, interaction effect terms and error terms).@en
INFO	lowercase_definition	STATO:0000383	IAO:0000115	a data transformation that finds a contrast value (the contrast estimate) by computing the weighted sum of model parameter estimates using a set of contrast weights.@en
INFO	lowercase_definition	STATO:0000384	IAO:0000115	estimate of a contrast obtained by computing the weighted sum of model parameter estimates using a set of contrast weights.@en
INFO	lowercase_definition	STATO:0000385	IAO:0000115	an estimate of the standard deviation of a contrast estimate sampling distribution.@en
INFO	lowercase_definition	STATO:0000389	IAO:0000115	"a power-law probability distribution is a probability distribution whose density function (or mass function in the discrete case) has the form

p(x) = L(x) . x^{-alpha}

where alpha is a parameter >1 and L(x) is a slowly varying function.@en"
INFO	lowercase_definition	STATO:0000391	IAO:0000115	an annotation property to provide a canonical command to invoke a method implementation using Python programming language@en
INFO	lowercase_definition	STATO:0000393	IAO:0000115	"the Pareto type-II probability distribution is a continuous probability distribution which is defined by a probability density function characterized by 2 parameters, alpha and lambda, 2 real, strictly positive numbers. alpha is known as the shape parameter while lambda is known as the scale parameter.

the function defines the probability of a continuous random variable according to the following:

p(x) = {\alpha \over \lambda} \left[{1+ {x \over \lambda}}\right]^{-(\alpha+1)}, \qquad x \geq 0,@en"
INFO	lowercase_definition	STATO:0000401	IAO:0000115	"the sample mean of sample of size n with n observations is an arithmetic mean computed over n number of observations on a statistical sample.
The sample mean, denoted \bar{x} and read “x-bar,” is simply the average of the n data points x_1, x_2, ..., x_n:

\bar{x} = {x_1 + x_2 + \cdots + x_n \over n} = {1 \over n} \sum_{i=1}^{n} x_i
The sample mean summarizes the \""location\"" or \""center\"" of the data.

the sample mean is a measure of central tendency of the observations made on the sample and provides an unbiased estimate of the population mean@en"
INFO	lowercase_definition	STATO:0000402	IAO:0000115	"the population mean or distribution mean is a parameter of a probability distribution or population indicative of the central tendency of the data. For continuous probability distributions, the population mean is computed using the probability density function; for discrete probability distributions, a probability mass function is used instead.
A population mean can be estimated by computing a sample mean@en"
INFO	lowercase_definition	STATO:0000404	IAO:0000115	the most common series or system of written mathematical symbols used to represent the entity@en
INFO	lowercase_definition	STATO:0000409	IAO:0000115	the likelihood ratio is a ratio which is formed by dividing the post-test odds by the pre-test odds in the context of a Bayesian formulation@en
INFO	lowercase_definition	STATO:0000410	IAO:0000115	the likelihood ratio of negative results is a ratio which is formed by dividing the difference between 1 and sensitivity of the test by the specificity value of a test. This can be expressed also as dividing the probability of a person who has the disease testing negative by the probability of a person who does not have the disease testing negative.@en
INFO	lowercase_definition	STATO:0000411	IAO:0000115	the likelihood ratio of positive results is a ratio which is formed by dividing the sensitivity value of a test by the difference between 1 and specificity of the test. This can be expressed also as dividing the probability of the test giving a positive result when testing an affected subject by the probability of the test giving a positive result when a subject is not affected.@en
INFO	lowercase_definition	STATO:0000414	IAO:0000115	mortality is a ratio formed by the number of deaths due to a disease divided by the total population size.@en
INFO	lowercase_definition	STATO:0000415	IAO:0000115	"in the context of binary classification, accuracy is defined as the proportion of true results (both true positives and true negatives) to the total number of cases examined (the sum of true positive, true negative, false positive and false negative).

It can be understood as a measure of the proximity of measurement results to the true value.

Accuracy is a metric used in the context of classification tasks to evaluate the proportion of correctly predicted instances among the total instances.

Key Points:

Use Case: Classification performance evaluation.
Metric: Measures the proportion of correct predictions.
Interpretation: Higher values indicate better classification performance.@en"
INFO	lowercase_definition	STATO:0000416	IAO:0000115	"precision or positive predictive value is defined as the proportion of the true positives against all the positive results (both true positives and false positives)

A proportion in which the numerator represents the correctly detected items within the denominator that represents all items detected.@en"
INFO	lowercase_definition	STATO:0000418	IAO:0000115	a measure of heterogeneity in meta-analysis is a data item which aims to describe the variation in study outcomes between studies.@en
INFO	lowercase_definition	STATO:0000423	IAO:0000115	the proportion of individuals in a population with the outcome of interest@en
INFO	lowercase_definition	STATO:0000427	IAO:0000115	restricted maximum likelihood estimation is a kind of maximum likelihood estimation data transformation which estimates the variance components of random-effects in univariate and multivariate meta-analysis. In contrast to 'maximum likelihood estimation', REML can produce unbiased estimates of variance and covariance parameters.@en
INFO	lowercase_definition	STATO:0000428	IAO:0000115	"maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model, given observations. MLE attempts to find the parameter values that maximize the likelihood function, given the observations.

The method of maximum likelihood is based on the likelihood function, \mathcal{L}(\theta ; x). We are given a statistical model, i.e. a family of distributions \{f(\cdot ; \theta) \mid \theta \in \Theta\}, where \theta denotes the (possibly multi-dimensional) parameter for the model. The method of maximum likelihood finds the values of the model parameter, \theta, that maximize the likelihood function, \mathcal{L}(\theta ; x).@en"
INFO	lowercase_definition	STATO:0000430	IAO:0000115	a random effect meta analysis procedure defined by Hartung and Knapp and by Sidik and Jonkman which performs better than DerSimonian and Laird approach, especially when there is heterogeneity and the number of studies in the meta-analysis is small.@en
INFO	lowercase_definition	STATO:0000431	IAO:0000115	a meta analysis which relies on the computation of the DerSimonian and Laird estimator as a measure of heterogeneity over a set of studies.@en
INFO	lowercase_definition	STATO:0000432	IAO:0000115	"a meta analysis which relies on the computation of the Hunter and Schmidt estimator as a measure of heterogeneity over a set of studies by considering the weighted mean of the raw correlation coefficient. Hunter and Schmidt developed what is commonly termed validity generalization procedures (Schmidt and Hunter, 1977). These involve correcting the effect sizes in the meta-analysis for sampling error, measurement error and range restriction.@en"
INFO	lowercase_definition	STATO:0000435	IAO:0000115	a probability distribution scale parameter is a measure of variation which is set by the operator when selecting a parametric probability distribution and which defines how spread the distribution is. The larger the value of the scale parameter is, the more spread out the distribution.@en
INFO	lowercase_definition	STATO:0000436	IAO:0000115	a probability distribution shape parameter is a data item which is set by the operator when selecting a parametric probability distribution and which dictates the profile, but not the location or size, of the distribution plot.@en
INFO	lowercase_definition	STATO:0000437	IAO:0000115	a scale estimator is a measurement datum (a statistic) which is calculated to approach the actual scale parameter of a probability distribution from observed data.@en
INFO	lowercase_definition	STATO:0000438	IAO:0000115	a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y=\ln(X) has a normal distribution. Likewise, if Y has a normal distribution, then X=\exp(Y) has a log-normal distribution. A random variable which is log-normally distributed takes only positive real values. The distribution is occasionally referred to as the Galton distribution or Galton's distribution, after Francis Galton.@en
INFO	lowercase_definition	STATO:0000439	IAO:0000115	outlier detection testing objective is a statistical objective of a data transformation which aims to test a null hypothesis that an observation is not an outlier.@en
INFO	lowercase_definition	STATO:0000444	IAO:0000115	"a split-plot design is a kind of factorial design which is used when running a full factorial completely randomized design is impractical, either for cost or practicalities (e.g. equipment, fields), in other words, when a restricted randomization has to be applied. A split-plot design is used whenever practitioners fix the level of a 'hard to change' factor and run all the combinations of the other factors. The hard to change factor is also referred to as the 'whole plot' factor, while the remaining factors are referred to as 'split plot' factors.
Performing a split-plot design therefore means fixing one factor level, and then applying the treatments formed by the cartesian product of the levels of the other factors. A minimum of 2 factors is required, with one being applied before the other(s).@en"
INFO	lowercase_definition	STATO:0000445	IAO:0000115	a split split plot design is a study design where restricted randomization affects 2 study factors (and not 1 as in split-plot design). Such a design is only possible if at least 3 independent variables are present.@en
INFO	lowercase_definition	STATO:0000447	IAO:0000115	"a 'whole plot number' is a data item used to count and identify the actual piece of land (in the case of real field based trials) used in a split plot design experiment and receiving treatments corresponding to the levels of a factor whose randomization is restricted (these factors are known as 'hard to change' factors).
In the case of non-field based trials, the 'whole plot' is a metaphor.@en"
INFO	lowercase_definition	STATO:0000448	IAO:0000115	"a 'sub plot number' is a data item used to count and identify the actual piece of land located within a 'whole plot', in the case of real field based trials using a split-plot design, and receiving completely randomized treatments corresponding to the factor level combinations of the remaining factors declared in the experiment.

in the case of 'split-split plot design', sub-plots also receive treatments corresponding to a factor whose randomization is restricted. In such a configuration, each 'sub-plot' is itself divided into 'sub sub-plots', which then receive the remainder of the treatments in completely randomized fashion.

In the case of non-field based trials, the notion 'sub-plot' is a metaphor.@en"
INFO	lowercase_definition	STATO:0000449	IAO:0000115	"a 'sub sub-plot number' is a data item used to count and identify the actual piece of land located within a 'sub plot', in the case of real field based trials using a split-split-plot design, and receiving completely randomized treatments corresponding to the factor level combinations of the remaining factors declared in the experiment.

in the case of 'split-split plot design', sub-plots also receive treatments corresponding to a factor whose randomization is restricted. In such a configuration, each 'sub-plot' is itself divided into 'sub sub-plots', which then receive the remainder of the treatments in completely randomized fashion.

In the case of non-field based trials, the notion 'sub sub-plot' is a metaphor.@en"
INFO	lowercase_definition	STATO:0000450	IAO:0000115	"\""Wilks' lambda is a test statistic used in multivariate analysis of variance (MANOVA) to test whether there are differences between the means of identified groups of subjects on a combination of dependent variables.\""@en"
INFO	lowercase_definition	STATO:0000452	IAO:0000115	"\""The Lawley–Hotelling trace is used to test the equality of mean vectors of k p‐variate normal distributions with common but unknown covariance matrix. The explicit form of the null distribution of T_{0}^{2} is the F distribution. The asymptotic null distribution is the chi‐square distribution. The power function of the test is described and its power is compared with the likelihood ratio test. \""@en"
INFO	lowercase_definition	STATO:0000454	IAO:0000115	"\""The multivariate analysis of variance, or MANOVA, is a procedure for comparing multivariate sample means. As a multivariate procedure, it is used when there are two or more dependent variables, and is typically followed by significance tests involving individual dependent variables separately.

It helps to answer:
1. Do changes in the independent variable(s) have significant effects on the dependent variables?
2. What are the relationships among the dependent variables?
3. What are the relationships among the independent variables?\""@en"
INFO	lowercase_definition	STATO:0000459	IAO:0000115	group sequential design is a study design used in clinical trial settings in which interim analyses of the data are conducted after groups of patients are recruited. After each interim analysis, the trial may stop early if the evidence so far shows the new treatment is particularly effective or ineffective. Such designs are ethical and cost-effective, and so are of great interest in practice.@en
INFO	lowercase_definition	STATO:0000460	IAO:0000115	interim analysis is a data transformation used to analyze studies implementing a group-sequential design, to evaluate and interpret the accumulating information during a clinical trial. It refers to an analysis of data that is conducted before full data collection has been completed. Clinical trials are unusual in that enrollment of patients is a continual process staggered in time. This means that if a treatment is particularly beneficial or harmful compared to the concurrent placebo group while the study is on-going, the investigators are ethically obliged to assess that difference using the data at hand and to make a deliberate consideration of terminating the study earlier than planned.@en
INFO	lowercase_definition	STATO:0000461	IAO:0000115	"the O'Brien-Fleming boundary analysis is a kind of interim-analysis method implemented by O'Brien and Fleming to account for the


Like all frequentist methods of the same type, it focuses on controlling the type I error rate, as the repeated hypothesis testing of accumulating data increases the type I error rate of a clinical trial.@en"
INFO	lowercase_definition	STATO:0000467	IAO:0000115	the model random effect term is a model term which aims to account for the unwanted variability in the data associated with a range of independent variables which are not the primary interest in the dataset. It is also known as the variance component of the model
INFO	lowercase_definition	STATO:0000468	IAO:0000115	a model fixed effect term is a model term which accounts for variation explained by an independent variable and its levels.
INFO	lowercase_definition	STATO:0000469	IAO:0000115	a model interaction effect term is a model term which accounts for variation explained by the combined effects of the factor levels of more than one (usually 2) independent variables.
INFO	lowercase_definition	STATO:0000470	IAO:0000115	a model error term is a model term which accounts for residual variation not explained by the other components (fixed and random effect terms)
INFO	lowercase_definition	STATO:0000471	IAO:0000115	an estimate is a data item which is computed from a dataset to provide an approximated value (an estimator) for a 'statistical parameter' (a 'characteristic/parameter' of the true underlying distribution) of a real population.
INFO	lowercase_definition	STATO:0000475	IAO:0000115	a data transformation to determine the number of degrees of freedom@en
INFO	lowercase_definition	STATO:0000478	IAO:0000115	a dataset which is made up of pedigree information, that is presenting ancestry or lineage information in a set of individuals of an organism.@en
INFO	lowercase_definition	STATO:0000481	IAO:0000115	a data transformation which calculates predictions of breeding values using an animal model and a relationship matrix calculated from the genomic/genetic markers (G Matrix), in contrast to using pedigree information as in BLUP, also known as ABLUP
INFO	lowercase_definition	STATO:0000482	IAO:0000115	a data transformation which calculates estimates of genomic estimated breeding values (GEBVs) on an animal or plant model utilizing trait-specific marker information.@en
INFO	lowercase_definition	STATO:0000485	IAO:0000115	the estimated breeding value of an organism is a data item computed to estimate the true breeding value defined as genetic merit of an organism, half of which will be passed on to its progeny. While the exact breeding value cannot be known, for performance traits it is possible to make good estimates. These estimates are called Estimated Breeding Values (EBVs). EBVs are expressed in the units of measurement for each particular trait. These estimates are the output of various estimation methods which differ depending on the underlying assumptions (equal variance of marker effect, all markers contributing to the trait), the mathematical methods used (Bayesian or non-Bayesian) and the genetic inheritance models being considered (additive, dominant, epistatic) selected by the analysts.@en
INFO	lowercase_definition	STATO:0000487	IAO:0000115	an additive genetic model is a data item which refers to the contributions to the final phenotype from more than one gene, or from alleles of a single gene (in heterozygotes), that combine in such a way that the sum of their effects in unison is equal to the sum of their effects individually.@en
INFO	lowercase_definition	STATO:0000488	IAO:0000115	an additive genetic model is a data item which refers to the contributions to the final phenotype from more than one gene, or from alleles of a single gene (in heterozygotes), that combine in such a way that the sum of their effects in unison is equal to the sum of their individual effects and their dominance effect (of alleles at a single locus).@en
INFO	lowercase_definition	STATO:0000489	IAO:0000115	an additive genetic model is a data item which refers to the contributions to the final phenotype from more than one gene, or from alleles of a single gene (in heterozygotes), that combine in such a way that the sum of their effects in unison is equal to the sum of their individual effects, their dominance effect (of alleles at a single locus) and their epistatic effect (of alleles at different loci).@en
INFO	lowercase_definition	STATO:0000493	IAO:0000115	a genotype matrix is a kind of genomic relationship matrix in its rawest form, which simply corresponds to a matrix of individuals' genotypes for a given set of markers or genomic positions. Columns are SNPs or markers, rows are individuals. Each column/row cell contains a genotype expressed, if the genome is diploid, as a pair of characters chosen from ATGC where the dominant variant is uppercased and the recessive variant is lowercased.
INFO	lowercase_definition	STATO:0000494	IAO:0000115	the MAF matrix is a genomic relationship matrix which is obtained from the genotype matrix by counting the number of minor alleles at each locus@en
INFO	lowercase_definition	STATO:0000495	IAO:0000115	"the M matrix is a genomic relationship matrix which is obtained by subtracting 1 from every value of the MAF matrix (gene content matrix). The values of the M matrix are only -1, 0 or 1, which makes computation easier.
M = MAF-1"
INFO	lowercase_definition	STATO:0000497	IAO:0000115	the Z-matrix is a genomic relationship matrix which is obtained by subtracting the P matrix from the M matrix. It is also known as the incidence matrix for the markers.
INFO	lowercase_definition	STATO:0000499	IAO:0000115	"augmented design is a kind of experimental design where the goal is to compare existing (control) treatments with new treatments that have an experimental constraint of \""limited replication\"". To understand limited replication, consider experiments that may only allow a single representation of the new treatment; this limitation is often due to the cost associated with the experiment, limited resources, or a limited number of new units that can be used in the experiment. In contrast, the existing treatments are referred to as checks and are generally replicated multiple times. With augmented design one can estimate the following:

a) Differences between checks and new treatments,
b) Differences among new treatments,
c) Differences among check treatments, and
d) Differences among new and check treatments combined."
INFO	lowercase_definition	STATO:0000500	IAO:0000115	a probability distribution location parameter is a data item which is set by the operator when selecting a parametric probability distribution and which dictates the way the location but not the profile or size of the distribution plot looks like.@en
INFO	lowercase_definition	STATO:0000501	IAO:0000115	"the Weibull probability distribution is a continuous probability distribution which is used to model time to fail, time to repair and material strength in material science. In biomedicine, the Weibull probability is used in determining 'hazard functions'.

The 'location parameter' of the Weibull probability distribution can be used to define a failure-free zone.

If the quantity X is a \""time-to-failure\"", the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, k, is that power plus one, and so this parameter can be interpreted directly as follows:
A value of k < 1 indicates that the failure rate decreases over time. This happens if there is significant \""infant mortality\"", or defective items failing early and the failure rate decreasing over time as the defective items are weeded out of the population. In the context of the diffusion of innovations, this means negative word of mouth: the hazard function is a monotonically decreasing function of the proportion of adopters;
A value of k = 1 indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure. The Weibull distribution reduces to an exponential distribution;
A value of k > 1 indicates that the failure rate increases with time. This happens if there is an \""aging\"" process, or parts that are more likely to fail as time goes on. In the context of the diffusion of innovations, this means positive word of mouth: the hazard function is a monotonically increasing function of the proportion of adopters. The function is first concave, then convex with an inflexion point at (e^{1/k}-1)/e^{1/k}, k > 1.@en"
INFO	lowercase_definition	STATO:0000502	IAO:0000115	statistical sampling is a planned process which aims at assembling a population of observation units (samples) in as unbiased a manner as possible in order to obtain or infer information about the actual population from which these samples have been drawn.
INFO	lowercase_definition	STATO:0000504	IAO:0000115	line intercept sampling is a sampling process by which an element in a spatial region is included in a sample if it is intersected by a line chosen by the operator.@en
INFO	lowercase_definition	STATO:0000508	IAO:0000115	stratified sampling is a statistical sampling method which divides the population into homogeneous subpopulations, which are then sampled using random or systematic sampling methods
INFO	lowercase_definition	STATO:0000509	IAO:0000115	systematic sampling is a process for collecting samples and assembling a statistical sample using a system or method (e.g. unequal probabilities, without replacement, fixed sample size), as opposed to a random sampling.@en
INFO	lowercase_definition	STATO:0000517	IAO:0000115	complete randomization is a group randomization where experimental units are randomly assigned to the entire set of groups defined by the experimental treatments.
INFO	lowercase_definition	STATO:0000519	IAO:0000115	last observation carried forward data imputation is a type of data imputation which uses a very simple, self-explanatory method for substituting a missing value for an observation. It should be noted that this method gives a biased estimate of the treatment effect and underestimates the variability of the estimated result and should be used cautiously.
INFO	lowercase_definition	STATO:0000520	IAO:0000115	regression data imputation is a type of data imputation where missing values are replaced with the value of a regression function coefficient.
INFO	lowercase_definition	STATO:0000521	IAO:0000115	substitution by the mean data imputation is a type of data imputation where missing values are replaced with the value of the variable mean.
INFO	lowercase_definition	STATO:0000522	IAO:0000115	multivariate imputation with chained equations (MICE) is a type of data imputation which uses an algorithm devised by Stef van Buuren and Karin Groothuis-Oudshoorn
INFO	lowercase_definition	STATO:0000523	IAO:0000115	k-nearest neighbour imputation is a data imputation which uses the k-nearest neighbour algorithm to compute a substitution value for the missing values. For every observation to be imputed, it identifies the ‘k’ closest observations based on the Euclidean distance and computes the weighted average (weighted based on distance) of these ‘k’ observations.
INFO	lowercase_definition	STATO:0000525	IAO:0000115	a covariance matrix is a square matrix that contains the variances and covariances associated with several variables. The diagonal elements of the matrix contain the variances of the variables and the off-diagonal elements contain the covariances between all possible pairs of variables.@en
INFO	lowercase_definition	STATO:0000526	IAO:0000115	the numerator relationship matrix is the matrix of expected additive genetic relationships between individuals. This matrix was originally used by Henderson (Henderson, C.R. 1976. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32:69-83.) to account for covariances between random effects, and therefore to use information from relatives in estimation of breeding value. Among the properties of the NRM matrix (also known as the A matrix): it is symmetric, and the diagonal values correspond to 1 + the inbreeding coefficient for an individual.
INFO	lowercase_definition	STATO:0000529	IAO:0000115	a scaled t distribution is a kind of Student's t distribution which is shifted by 'mean' and scaled by standard deviation 'sd'.
INFO	lowercase_definition	STATO:0000530	IAO:0000115	a Bayesian model is a statistical model where inference is based on using Bayes theorem to obtain a posterior distribution for a quantity (or quantities) of interest for some model (such as parameter values) based on some prior distribution for the relevant unknown parameters and the likelihood from the model.@en
INFO	lowercase_definition	STATO:0000531	IAO:0000115	a prior probability distribution is a probability distribution used as input to a Bayesian model to represent a priori knowledge about a model parameter. Along with the acquired/observed data, it is used to compute a posterior distribution according to the Bayes theorem.
INFO	lowercase_definition	STATO:0000532	IAO:0000115	a posterior probability distribution is a probability distribution computed in a Bayesian model approach given a prior distribution and a set of events/observations.
INFO	lowercase_definition	STATO:0000534	IAO:0000115	genetic inheritance model is a data item defining the assumption used by a breeding value estimation method to consider when running the calculations.
INFO	lowercase_definition	STATO:0000535	IAO:0000115	sampling from a probability distribution is a data transformation which aims at obtaining a sequence of random samples from a probability distribution for which direct sampling is difficult.
INFO	lowercase_definition	STATO:0000537	IAO:0000115	the Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution for which direct sampling is difficult.
INFO	lowercase_definition	STATO:0000538	IAO:0000115	a continuous multivariate probability distribution is a continuous probability distribution which describes the possible values, and corresponding probabilities, of two or more (usually three or more) associated random variables.
INFO	lowercase_definition	STATO:0000539	IAO:0000115	a discrete multivariate probability distribution is a discrete probability distribution which describes the possible values, and corresponding probabilities, of two or more (usually three or more) associated random variables.
INFO	lowercase_definition	STATO:0000541	IAO:0000115	a state space model is a kind of statistical model which describes the probabilistic dependence between the latent state variable and the observed measurement. The state or the measurement can be either continuous or discrete. The term “state space” originated in the 1960s in the area of control engineering (Kalman, 1960). SSM provides a general framework for analyzing deterministic and stochastic dynamical systems that are measured or observed through a stochastic process.
INFO	lowercase_definition	STATO:0000542	IAO:0000115	genomic estimated breeding value (GEBV) is an estimated breeding value derived from information in an organism's DNA (genotype). GEBV is calculated differently to conventional Estimated Breeding Values, using advanced modeling techniques to deal with high dimensionality data.
INFO	lowercase_definition	STATO:0000549	IAO:0000115	random forest procedure is a type of data transformation used in classification and statistical learning using regression. The random forest procedure is a meta estimator that fits a number of classifying decision trees on various sub-samples of the dataset (it operates by constructing a multitude of decision trees at training time) and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default). The random forest procedure outputs the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.@en
INFO	lowercase_definition	STATO:0000550	IAO:0000115	"log likelihood is a data item which corresponds to the natural logarithm of the likelihood.
log likelihood is a data item commonly used to provide a measure of accuracy of a model."
INFO	lowercase_definition	STATO:0000554	IAO:0000115	number of cross-validation segments is a count which is used as input parameter in a cross validation procedure to evaluate a statistical model.
INFO	lowercase_definition	STATO:0000555	IAO:0000115	number of predictive components is a count used as input to the principal component analysis (PCA)
INFO	lowercase_definition	STATO:0000556	IAO:0000115	number of orthogonal components is a count used as input to the orthogonal partial least square discriminant analysis (OPLS-DA)
INFO	lowercase_definition	STATO:0000557	IAO:0000115	computed_from is a relation between 2 information content entities denoting how one is derived from another through the application of a data transformation or computation process.@en
INFO	lowercase_definition	STATO:0000559	IAO:0000115	"the Wald test is a statistical test which computes a Wald chi-squared test for 1 or more coefficients, given their variance-covariance matrix.
The Wald test (also called the Wald Chi-Squared Test) is a way to find out if explanatory variables in a model are significant. “Significant” means that they add something to the model; variables that add nothing can be deleted without affecting the model in any meaningful way"
INFO	lowercase_definition	STATO:0000560	IAO:0000115	the Rao-Scott score is a statistic which is used to test the hypothesis that all coefficients associated with a particular regression term are zero (or have some other specified values). the LRT uses a linear combination of chi-squared distributions
INFO	lowercase_definition	STATO:0000561	IAO:0000115	"the frequency (i.e., the proportion) of possible confidence intervals that contain the true value of their corresponding parameter. In other words, if confidence intervals are constructed using a given confidence level in an infinite number of independent experiments, the proportion of those intervals that contain the true value of the parameter will match the confidence level.
A probability measure of the reliability of an inferential statistical test that has been applied to sample data and which is provided along with the confidence interval for the output statistic."
INFO	lowercase_definition	STATO:0000565	IAO:0000115	"a regression coefficient is a measure of association that is used as the coefficient of an independent variable in a regression model, of the dependent variable, which is linear in its parameters.

A value of zero means no association. The sign (positive or negative) reflects the direction of association.

a regression coefficient is a measure of association generated by a type of data transformation called a regression, which aims to model a response variable by expressing the predictor variables as part of a function where variable terms are modified by a number. A regression coefficient is one such number.@en"
INFO	lowercase_definition	STATO:0000572	IAO:0000115	a version of PLS used for classification, where the input y-block are group labels (categorical variable) rather than a continuous variable@en
INFO	lowercase_definition	STATO:0000575	IAO:0000115	a data transformation which finds principal component by applying non-linear iterative partial least squares algorithm
INFO	lowercase_definition	STATO:0000577	IAO:0000115	a partial least square regression applied when there is only one variable in Y (the matrix of response variables), or it is desirable to model and optimize separately the performance of each of the variables in Y. This case is usually referred to as PLS1 regression (J = 1).
INFO	lowercase_definition	STATO:0000578	IAO:0000115	a partial least square regression applied to a multivariate response variable.
INFO	lowercase_definition	STATO:0000579	IAO:0000115	improved kernel PLS is a data transformation which implements a very fast kernel algorithm for updating PLS models in a recursive manner and for exponentially discounting past data.
INFO	lowercase_definition	STATO:0000580	IAO:0000115	variable importance in projection is a measure computed as part of a partial least square regression to accumulate the importance of each variable j being reflected by w from each component.
INFO	lowercase_definition	STATO:0000581	IAO:0000115	"a data transformation which computes the singular-value decomposition of a rectangular matrix.
The singular-value decomposition is very general in the sense that it can be applied to any m × n matrix whereas eigenvalue decomposition can only be applied to certain classes of square matrices."
INFO	lowercase_definition	STATO:0000582	IAO:0000115	best linear unbiased estimator
INFO	lowercase_definition	STATO:0000583	IAO:0000115	"a completely randomized design is a type of design of experiment where the observation units receive treatments (independent variable level) entirely at random. In other words, the observation units are randomly assigned to treatments.
Completely randomized designs differ from randomized complete block designs and should not be confused with them: in the latter, a blocking variable is first used to assign experimental units to blocks. Only then are the members of each block randomly assigned to different treatment groups"
INFO	lowercase_definition	STATO:0000584	IAO:0000115	"the Wald statistic is a statistic used during a Wald test, a test of significance of the regression coefficient; it is based on the asymptotic normality property of maximum likelihood estimates, and is computed as:

W = b * 1/Var(b) * b

In this formula, b stands for the parameter estimates, and Var(b) stands for the asymptotic variance of the parameter estimates. The Wald statistic is tested against the Chi-square distribution in the Wald test."
INFO	lowercase_definition	STATO:0000585	IAO:0000115	degree of freedom calculation is a data transformation which is part of a statistical test and which aims to determine or estimate the number of degrees of freedom in a system.
INFO	lowercase_definition	STATO:0000586	IAO:0000115	a restricted randomized design is a kind of study design which uses randomization to allocate observation units to treatments but where intuitively poor allocations of treatments to experimental units are avoided, while retaining the theoretical benefits of randomization. This is often the case when so-called 'hard to change' factors are used in an experimental design.@en
INFO	lowercase_definition	STATO:0000587	IAO:0000115	"the percentage of variance is an output of principal component analysis (PCA), which is obtained by forming the ratio of an eigen-value divided by the sum of all eigen-values. This produces a \""percentage of variance\"" for each eigen-vector."
INFO	lowercase_definition	STATO:0000588	IAO:0000115	the scaled identity covariance structure is a type of covariance structure which has constant variance. The assumption is that there is no correlation between any elements.
INFO	lowercase_definition	STATO:0000590	IAO:0000115	"median of the ratios corrected count is a kind of count produced during an RNA-Seq data normalization procedure which corresponds to dividing counts by sample-specific size factors determined by median ratio of gene counts relative to geometric mean per gene.
It was first described by Anders and Huber in 2010 (https://doi.org/10.1186/gb-2010-11-10-r106)


Recommended use:

-\""median of the ratios corrected count\"" is suited for Differential Expression analysis or between samples.

-\""median of the ratios corrected count\"" is NOT suited for gene count comparisons within a sample."
INFO	lowercase_definition	STATO:0000592	IAO:0000115	the Rand index is a ratio, related to the notion of accuracy (STATO_0000415), which is used to compare the similarity of two clustering outcomes.@en
INFO	lowercase_definition	STATO:0000593	IAO:0000115	the adjusted Rand index is a measure which rescales the Rand index, taking into account that random chance will cause some objects to occupy the same clusters, so the Rand Index will never actually be zero.@en
INFO	lowercase_definition	STATO:0000594	IAO:0000115	"a confusion matrix is a 2 by 2 contingency table used to evaluate the performance of a classifier, often a machine-learning classifier and that allows visualization of the performance of an algorithm, typically a supervised learning one. It defines two dimensions (\""actual\"" and \""predicted\""), and identical sets of \""classes\"" in both dimensions (each combination of dimension and class is a variable in the contingency table).@en"
INFO	lowercase_definition	STATO:0000595	IAO:0000115	the number of true positive is a count which denotes how many elements are correctly classified as having a feature they are actually known to be having (e.g. carrier of a pathogen).
INFO	lowercase_definition	STATO:0000596	IAO:0000115	the number of false positive is a count which denotes how many elements known to be void of feature are wrongly classified as having it (e.g. being diagnosed with a disease when one is totally healthy)
INFO	lowercase_definition	STATO:0000597	IAO:0000115	the number of true negative is a count which denotes how many elements are correctly classified as void of a feature they are actually known to be missing (e.g. free of pathogen).
INFO	lowercase_definition	STATO:0000598	IAO:0000115	the number of false negative is a count which denotes how many elements known to be having a feature are wrongly classified as being devoid of it (e.g. being given an all clear while being actually infected and carrier of a pathogen)
INFO	lowercase_definition	STATO:0000599	IAO:0000115	a point estimate is a data item which provides a particular value evaluating a population parameter@en
INFO	lowercase_definition	STATO:0000600	IAO:0000115	an interval estimate is a data item corresponding to a range of values likely to contain the population parameter of interest@en
INFO	lowercase_definition	STATO:0000601	IAO:0000115	simultaneous multiple testing method is a multiple testing correction method which...
INFO	lowercase_definition	STATO:0000602	IAO:0000115	sequential multiple testing method is a multiple testing correction method which...
INFO	lowercase_definition	STATO:0000603	IAO:0000115	a sequential multiple correction procedure which does not maintain a constant false positive rate but allows it to grow controllably.
INFO	lowercase_definition	STATO:0000604	IAO:0000115	a type of sequential multiple testing correction method
INFO	lowercase_definition	STATO:0000605	IAO:0000115	a type of sequential multiple testing correction
INFO	lowercase_definition	STATO:0000606	IAO:0000115	q is the basic statistic for the studentized range distribution, which is used for multiple comparison procedures, such as the single step procedure Tukey's range test, the Newman–Keuls method, and the Duncan's step down procedure, and establishing confidence intervals that are still valid after data snooping has occurred
INFO	lowercase_definition	STATO:0000607	IAO:0000115	a proportion is a ratio which corresponds to the fraction of the total presenting a particular feature
INFO	lowercase_definition	STATO:0000609	IAO:0000115	@en
INFO	lowercase_definition	STATO:0000610	IAO:0000115	measure of association is a statistic which quantitatively represents a relationship between two or more variables@en
INFO	lowercase_definition	STATO:0000611	IAO:0000115	"measure of correlation is a measure of association between ordinal or continuous variables.

A value of 0 means no association.
A positive value means a positive association (as one variable increases, the other variable increases).
A negative value means a negative association (as one variable increases, the other variable decreases).
For correlation coefficients, the possible values range from +1 (perfect positive association) to -1 (perfect negative association)@en"
INFO	lowercase_definition	STATO:0000621	IAO:0000115	diagnostic yield is a proportion in which the numerator represents the correctly detected items within the denominator that represents all items tested.
INFO	lowercase_definition	STATO:0000622	IAO:0000115	ratio-based measure of association is a measure of association which relies on a quotient of 2 quantities to indicate the strength of the association.@en
INFO	lowercase_definition	STATO:0000627	IAO:0000115	"odds correspond to a ratio in which the numerator represents the probability that an event will occur and the denominator represents the probability that an event will not occur.

'Odds' and 'Odds ratio' are different terms. 'Odds' is a ratio of probabilities. 'Odds ratio' is a ratio of two different odds.

Odds are calculated as p / (1-p) where p is the probability of event occurrence. When p = 0, the odds = 0. When p = 1, the odds may be expressed as not calculable or as \""odds against = 0\"".

Odds may be expressed as p:(1-p). Odds may be expressed as p:q where q = 1-p. Odds may be expressed as a:b where a and b are multiples of p and (1-p). Examples of different expressions of the same odds include 3:2, 3/2, 0.6:0.4, 0.6/0.4, and 1.5.

Odds may be expressed as \""odds for\"" or \""odds in favor\"" (e.g. 1:5 for a \""3\"" on a 6-sided die) or \""odds against\"" (e.g. 5:1 against a \""3\"" on a 6-sided die).

The term \""betting odds\"" used in gambling that involves financial amounts in the formulation is not an \""Odds\"" in the definition of the Scientific Evidence Code System."
INFO	lowercase_definition	STATO:0000633	IAO:0000115	a cutoff is an information content entity that represents or sets the boundary at which something changes.
INFO	lowercase_definition	STATO:0000642	IAO:0000115	a matrix is a rectangular array of numbers, which are called entries of the matrix.
INFO	lowercase_definition	STATO:0000643	IAO:0000115	the sample variance is a variance computed over the actual observations made, which correspond to a sample drawn from a population in an experiment. the sample variance can be used to estimate the true variance of the underlying population/distribution.
INFO	lowercase_definition	STATO:0000646	IAO:0000115	"the population variance is a variance of the true population from which a sample is derived.
the population variance describes the variability of a characteristic in the population."
INFO	lowercase_definition	STATO:0000650	IAO:0000115	"sampling variance refers to the variability in the estimates of a population parameter that arises from random sampling.
sampling variance is a variance of the sampling distribution of a random variable and estimates the dispersion of sample estimates about their expected value in hypothetical repetitions of the sample."
INFO	lowercase_definition	STATO:0000651	IAO:0000115	the output of a statistical sampling, a draw from a distribution or a population of physical or immaterial entities.
INFO	lowercase_definition	STATO:0000652	IAO:0000115	calibration in statistics refers to the process of ensuring that the predicted probabilities or scores from a statistical model accurately reflect the true probabilities or outcomes observed in the data. It is an essential aspect of predictive modeling to ensure the reliability and interpretability of model predictions, where the goal is to estimate the likelihood of certain events or outcomes.
INFO	lowercase_definition	STATO:0000654	IAO:0000115	the objective of a data transformation to test whether a null hypothesis of absence of difference within subject holds.
INFO	lowercase_definition	STATO:0000655	IAO:0000115	calibration plot is a line graph which plots values resulting from predictions against values obtained through observation.
INFO	lowercase_definition	STATO:0000656	IAO:0000115	"the slope of a line graph is a data item denoting the rate of change between the two variables represented on the graph.
the slope is used to visualize and interpret the relationship between two variables."
INFO	lowercase_definition	STATO:0000657	IAO:0000115	an intercept is a data item which corresponds to where a graph line cuts (intercepts) a coordinate axis.
INFO	lowercase_definition	STATO:0000696	IAO:0000115	a value-time curve is a graph which plots time on the x-axis versus the value of a variable of interest as delivered by a process. It is used to represent the relationship between the value and the time it takes to achieve that value. It is commonly used in project management or business analysis to evaluate and optimize how efficiently value is delivered over time.@en
INFO	lowercase_definition	STATO:0000697	IAO:0000115	a homogeneity test is a statistical test aiming to evaluate whether the statistical measures from several random samples are similar@en
INFO	lowercase_definition	STATO:0000701	IAO:0000115	a statistical test which tests for homogeneity of proportions, which is used when comparing proportions observed across multiple groups. It relies on frequencies calculated in contingency tables. It determines whether proportions are consistent.
INFO	missing_superclass	BFO:0000001	rdfs:subClassOf