Statistical Manager Algorithms
From Gcube Wiki
Revision as of 09:13, 14 June 2016 by Gianpaolo.coro (→Signal Processing Algorithms)
The complete list of algorithms supported by the Statistical Manager service is reported below.
Algorithms are clustered in the following categories: ... to be completed
- Clustering: ...
- Ecological Modeling: ...
- Signal Processing: ...
- Miscellaneous: algorithms not belonging to any of the above categories;
- A: ABSENCE_CELLS_FROM_AQUAMAPS;
- B: BIOCLIMATE_HCAF, BIOCLIMATE_HSPEC, BIOCLIMATE_HSPEN, BIONYM, BIONYM_BIODIV, BIONYM_LOCAL;
- F: FEED_FORWARD_ANN, FEED_FORWARD_A_N_N_DISTRIBUTION, FIN_GSAY_MATCH, FIN_TAXA_MATCH;
- G: GET_OCCURRENCES_ALGORITHM, GET_TAXA_ALGORITHM;
- H: HCAF_FILTER, HCAF_INTERPOLATION, HRS, HSPEN, HSPEN_FILTER;
- T: TIMEEXTRACTION;
- Z: ZETAEXTRACTION_TABLE.
Clustering Algorithms
DBSCAN
| |
---|---|
Description | A clustering algorithm for real-valued vectors that relies on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. A maximum of 4000 points is allowed. |
A clustering algorithm for real-valued vectors that relies on the density-based spatial clustering of applications with noise (DBSCAN) algorithm. It accepts as input a table and some parameters characterising the expected result such as the epsilon and the minimum number of items in a cluster. It produces <output>. <limitation>. For more information see: Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu (1996). "A density-based algorithm for discovering clusters in large spatial databases with noise". In Evangelos Simoudis, Jiawei Han, Usama M. Fayyad. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231. ISBN 1-57735-004-9. | |
Type | Clustering |
Execution | ... |
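The role of the two parameters mentioned above (epsilon and the minimum number of items in a cluster) can be illustrated with a minimal, standard-library Python sketch of the textbook DBSCAN procedure. This is not the service's implementation; the function name `dbscan`, its arguments and the label convention (`-1` for noise) are assumptions made for illustration only.

```python
from math import dist

NOISE = -1  # label used here for points that belong to no cluster (assumption)


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within distance eps of point i (i itself included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels
```

A larger `eps` merges nearby groups, while a larger `min_pts` turns sparse groups into noise. This naive version compares every pair of points, which is one plausible reason for capping the input size.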
KMEANS
| |
Description | A clustering algorithm for real-valued vectors that relies on the k-means algorithm, i.e. a method aiming to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. A maximum of 4000 points is allowed. |
A clustering algorithm for real-valued vectors that relies on the k-means algorithm, i.e. a method aiming to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. It accepts as input a table and some parameters characterising the result such as the number of expected clusters, the maximum number of iterations, the minimum number of points defining an outlier. It produces … . The implementation supports tables containing 4000 entries at maximum. For more information see: MacQueen, J. B. (1967). "Some Methods for classification and Analysis of Multivariate Observations". Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability 1. University of California Press. pp. 281–297. MR 0214227. Zbl 0214.46201. Retrieved 2009-04-07. | |
Type | Clustering |
Execution | ... |
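The "nearest mean" rule described above can be sketched with the classic Lloyd iteration. The snippet below is only an illustration under stated assumptions: the function name `kmeans` is hypothetical, the initial centroids are simply the first k points (chosen for reproducibility, not necessarily what the service does), and the outlier parameter is left out.

```python
from math import dist


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; returns (cluster index per point, list of centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic start (assumption)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to the cluster with the nearest mean.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids
```

The `max_iterations` argument corresponds to the maximum number of iterations mentioned in the description; the loop usually stops earlier, as soon as the assignment no longer changes.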
Ecological Modeling Algorithms
AQUAMAPSNN
| |
---|---|
Description | The AquaMaps model trained using a Feed Forward Neural Network. This is a method to train a generic Feed Forward Artificial Neural Network to be used by the AquaMaps Neural Network algorithm. Produces a trained neural network in the form of a compiled file which can be used later. |
A <type> algorithm that <what it does>. It accepts as input <input>. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Models |
Execution | ... |
AQUAMAPS_NATIVE
| |
Description | Algorithm for Native Distribution by AquaMaps. A distribution algorithm that generates a table containing species distribution probabilities on half-degree cells according to the AquaMaps approach for Native (Actual) distributions. |
A distribution algorithm that generates a table containing species distribution probabilities on half-degree cells according to the AquaMaps approach with native distribution. It accepts as input a table containing species envelopes (HSPEN), a table containing environmental parameters (HCAF) and a table containing species occurrence points (half-degree cells). It produces a table containing species distribution probabilities. <limitation>. For more information see: Kesner-Reyes, K., K. Kaschner, S. Kullander, C. Garilao, J. Barile, and R. Froese. 2012. AquaMaps: algorithm and data sources for aquatic organisms. In: Froese, R. and D. Pauly. Editors. 2012. FishBase. World Wide Web electronic publication. www.fishbase.org, version (04/2012). | |
Type | Distributions |
Execution | ... |
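As a rough illustration of how a species envelope (HSPEN) and the environmental parameters of a cell (HCAF) can be combined into a probability, the sketch below uses the trapezoidal response commonly described for the AquaMaps approach: 1 inside the preferred range, decreasing linearly to 0 at the absolute limits, with the per-parameter values multiplied together. The function names, dictionary layout and parameter names are hypothetical, and the service may apply additional rules not shown here.

```python
def trapezoid(value, vmin, pref_min, pref_max, vmax):
    """Response of one environmental parameter to an envelope (min, prefMin, prefMax, max)."""
    if value < vmin or value > vmax:
        return 0.0
    if value < pref_min:
        return (value - vmin) / (pref_min - vmin)
    if value > pref_max:
        return (vmax - value) / (vmax - pref_max)
    return 1.0


def cell_probability(cell, envelope):
    """Multiply the responses of all parameters listed in the envelope."""
    probability = 1.0
    for name, bounds in envelope.items():
        probability *= trapezoid(cell[name], *bounds)
    return probability


# Hypothetical envelope and half-degree cell, for illustration only.
envelope = {"temperature": (0, 10, 20, 30), "salinity": (30, 32, 36, 40)}
cell = {"temperature": 25, "salinity": 38}
print(cell_probability(cell, envelope))
```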
AQUAMAPS_NATIVE_2050
| |
Description | Algorithm for Native 2050 Distribution by AquaMaps. A distribution algorithm that generates a table containing species distribution probabilities on half-degree cells according to the AquaMaps approach with native distribution estimated for 2050. |
Type | Distributions |
Execution | ... |
AQUAMAPS_NATIVE_NEURALNETWORK
| |
Description | Aquamaps Native Algorithm calculated by a Neural Network. A distribution algorithm that relies on Neural Networks and AquaMaps data for native distributions to generate a table containing species distribution probabilities on half-degree cells. |
Type | Distributions |
Execution | ... |
AQUAMAPS_SUITABLE
| |
Description | Algorithm for Suitable Distribution by AquaMaps. A distribution algorithm that generates a table containing species distribution probabilities on half-degree cells according to the AquaMaps approach for suitable (potential) distributions. |
Type | Distributions |
Execution | ... |
AQUAMAPS_SUITABLE_2050
| |
Description | Algorithm for Suitable 2050 Distribution by AquaMaps. A distribution algorithm that generates a table containing species distribution probabilities on half-degree cells according to the AquaMaps approach for suitable (potential) distributions for the 2050 scenario. |
Type | Distributions |
Execution | ... |
AQUAMAPS_SUITABLE_NEURALNETWORK
| |
Description | Aquamaps Algorithm for Suitable Environment calculated by Neural Network. A distribution algorithm that relies on Neural Networks and AquaMaps data for suitable distributions to generate a table containing species distribution probabilities on half-degree cells. |
Type | Distributions |
Execution | ... |
Signal Processing Algorithms
Time Series Analysis
| |
---|---|
Description | An algorithm applying signal processing to a non-uniform time series. A maximum of 10000 distinct points in time is allowed to be processed. The process uniformly samples the series, then extracts hidden periodicities and signal properties. The sampling period is the shortest time difference between two points. Finally, by using Caterpillar-SSA the algorithm forecasts the time series. The output shows the detected periodicity, the forecasted signal and the spectrogram. |
Type | Time Series Analysis |
Execution | ... |
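The first two steps described above (uniform sampling with the shortest time difference as sampling period, then periodicity detection) can be sketched as follows. This is a simplified illustration, not the service's code: linear interpolation and a plain discrete Fourier transform are assumptions, the function names are hypothetical, and the Caterpillar-SSA forecast is not covered.

```python
import cmath
import math


def resample_uniform(times, values):
    """Interpolate a sorted, irregular series onto a grid whose step is the smallest gap."""
    step = min(b - a for a, b in zip(times, times[1:]))
    n = int((times[-1] - times[0]) / step + 1e-9) + 1
    samples, j = [], 0
    for i in range(n):
        t = times[0] + i * step
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1  # advance to the segment [times[j], times[j + 1]] containing t
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        samples.append(values[j] + w * (values[j + 1] - values[j]))
    return step, samples


def dominant_period(samples, step):
    """Period of the strongest non-constant DFT component, or None for a flat signal."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centred)
        )
        if abs(coeff) > best_power:
            best_k, best_power = k, abs(coeff)
    return None if best_k is None else n * step / best_k
```

For example, a sine wave of period 8 observed at integer times with a few observations missing is resampled with step 1, and `dominant_period` then reports a period of 8 time units.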
Miscellaneous Algorithms
ABSENCE_CELLS_FROM_AQUAMAPS
| |
---|---|
Description | An algorithm producing cells and features (HCAF) for a species containing absence points taken from an AquaMaps distribution. |
A transducer algorithm that generates a Half-degree Cells Authority File (HCAF) dataset for the estimated absence points of a species. It accepts as input a table xxx, a table xxx, the target species and the number of points to select. It produces an HCAF table containing environmental parameters on selected points. <limitation>. For more information see: Kesner-Reyes, K., K. Kaschner, S. Kullander, C. Garilao, J. Barile, and R. Froese. 2012. AquaMaps: algorithm and data sources for aquatic organisms. In: Froese, R. and D. Pauly. Editors. 2012. FishBase. World Wide Web electronic publication. www.fishbase.org, version (04/2012). | |
Type | Transducer |
Execution | Single machine |
BIOCLIMATE_HCAF
| |
Description | A transducer algorithm that generates a Half-degree Cells Authority File (HCAF) dataset for a certain time frame, with environmental parameters used by the AquaMaps approach. Evaluates the impact of climatic changes on the variation of the ocean features contained in HCAF tables. |
Type | Transducer |
Execution | Single machine |
BIOCLIMATE_HSPEC
| |
Description | A transducer algorithm that generates a table containing an estimate of species distributions per half-degree cell (HSPEC) in time. Evaluates the impact of climatic changes on species presence. |
Type | Transducer |
Execution | Single machine |
BIOCLIMATE_HSPEN
| |
Description | A transducer algorithm that generates a table containing species envelopes (HSPEN) in time, i.e. models capturing species tolerance with respect to environmental parameters, used by the AquaMaps approach. Evaluates the impact of climatic changes on the variation of the salinity values in several ranges of a set of species envelopes. |
Type | Transducer |
Execution | Single machine |
BIONYM
| |
Description | An algorithm implementing BiOnym, a flexible workflow approach to taxon name matching. The workflow can activate several taxon name matching algorithms and returns the list of possible transcriptions for a list of raw input species names, optionally with authorship indication. |
Type | ??? |
Execution | ??? |
BIONYM_BIODIV
| |
Description | An algorithm implementing BiOnym, a flexible workflow approach to taxon name matching. The workflow can activate several taxon name matching algorithms and returns the list of possible transcriptions for a list of raw input species names, optionally with authorship indication. |
Type | ??? |
Execution | ??? |
BIONYM_LOCAL
| |
Description | A fast version of the algorithm implementing BiOnym, a flexible workflow approach to taxon name matching. The workflow can activate several taxon name matching algorithms and returns the list of possible transcriptions for a list of raw input species names, optionally with authorship indication. |
Type | ??? |
Execution | Single machine |
DISCREPANCY_ANALYSIS
| |
Description | An evaluator algorithm that compares two tables containing real-valued vectors. It drives the comparison by relying on a geographical distance threshold and a threshold for K-Statistic. |
An evaluator algorithm that compares two tables containing estimations of species occurrence by species and half-degree cell (HSPEC). It accepts as input the two tables and some parameters driving the comparison such as the comparison threshold and the threshold for K-Statistic. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Evaluator |
Execution | Single machine |
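To make the two thresholds more concrete, the sketch below compares two probability tables keyed by (species, cell). It rests on several assumptions that the page does not confirm: "K-Statistic" is read as Cohen's kappa, the kappa threshold is used to binarise probabilities into presence/absence, and the comparison threshold is applied to the absolute difference of probabilities. The function name `compare_tables` and the table layout are hypothetical.

```python
def compare_tables(table_a, table_b, comparison_threshold=0.1, k_threshold=0.5):
    """Return (number of discrepant entries, Cohen's kappa) over the shared keys."""
    keys = sorted(table_a.keys() & table_b.keys())
    n = len(keys)
    # An entry is discrepant when the two estimates differ by more than the threshold.
    discrepancies = sum(
        abs(table_a[k] - table_b[k]) > comparison_threshold for k in keys
    )
    # Binarise into presence/absence before computing the agreement statistic.
    a = [table_a[k] >= k_threshold for k in keys]
    b = [table_b[k] >= k_threshold for k in keys]
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    kappa = 1.0 if expected == 1 else (observed - expected) / (1 - expected)
    return discrepancies, kappa
```

A kappa close to 1 indicates that the two tables agree on presence/absence far more often than chance would explain, while a value near 0 indicates chance-level agreement.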
FEED_FORWARD_ANN
| |
Description | A method to train a generic Feed Forward Artificial Neural Network in order to simulate a function from the features space (R^n) to R. Uses the Back-propagation method. Produces a trained neural network in the form of a compiled file which can be used in the FEED FORWARD NEURAL NETWORK DISTRIBUTION algorithm. |
A modeling algorithm that relies on Neural Networks to <xxx>. It accepts as input a table containing the training dataset and some parameters affecting the algorithm behaviour such as the number of neurons, the learning threshold and the maximum number of iterations. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Models |
Execution | ??? |
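The training procedure named above (back-propagation on a feed-forward network mapping R^n to R) can be illustrated with a small standard-library sketch. It is not the service's implementation: the single hidden layer with sigmoid units, the linear output, the learning rate, the seed and the function name `train` are all assumptions. The arguments `n_hidden`, `learning_threshold` and `max_iterations` mirror the parameters listed in the description.

```python
import math
import random


def train(samples, targets, n_hidden=3, learning_threshold=1e-3,
          max_iterations=2000, rate=0.1, seed=0):
    """Train a one-hidden-layer network; returns (predict function, final mean squared error)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # One weight row per hidden neuron; the last entry of each row is the bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        hidden = [
            1 / (1 + math.exp(-(sum(w * v for w, v in zip(row, x)) + row[-1])))
            for row in w1
        ]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + w2[-1]

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in zip(samples, targets)) / len(samples)

    for _ in range(max_iterations):
        if mse() <= learning_threshold:
            break  # error is low enough: stop before the iteration limit
        for x, t in zip(samples, targets):
            hidden, y = forward(x)
            delta_out = y - t  # gradient of the squared error at the linear output
            for j, h in enumerate(hidden):
                delta_h = delta_out * w2[j] * h * (1 - h)  # back-propagated error
                w2[j] -= rate * delta_out * h
                for i, v in enumerate(x):
                    w1[j][i] -= rate * delta_h * v
                w1[j][-1] -= rate * delta_h
            w2[-1] -= rate * delta_out
    return (lambda x: forward(x)[1]), mse()
```

The returned function plays the role of the trained network; the service instead stores the trained network as a compiled file for later use.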
FEED_FORWARD_A_N_N_DISTRIBUTION
| |
Description | A Bayesian method using a Feed Forward Neural Network to simulate a function from the features space (R^n) to R. A modeling algorithm that relies on Neural Networks to simulate a real-valued function. It accepts as input a table containing the training dataset and some parameters affecting the algorithm behaviour such as the number of neurons, the learning threshold and the maximum number of iterations. |
Type | Distribution |
Execution | ??? |
FIN_GSAY_MATCH
| |
Description | An algorithm for GSAy Matching with respect to the FishBase database |
Type | Transducers |
Execution | Single Machine |
FIN_TAXA_MATCH
| |
Description | An algorithm for Taxa Matching with respect to the FishBase database |
A transducer algorithm that compares a species nomenclature with the FishBase database according to the TAXAMATCH approach. It accepts as input the species nomenclature (genus and species) and the comparison operators to use, e.g. equal, begins with, contains. It produces <output>. <limitation>. For more information see: Rees, T., 2008. Applications of fuzzy (approximate string) matching in taxonomic database searches, with an example multi-tiered approach. [Extended abstract]. Pp. 12-14 in Worcester, T., Bajona, L. & Branton, B. (eds): Proceedings of a Conference on Ocean Biodiversity Informatics, Bedford Institute of Oceanography, Dartmouth, Nova Scotia, 2-4 October 2007. Bedford Institute of Oceanography, 2008 (CSAS/SCCS Proceedings Series 2008/024). | |
Type | Transducers |
Execution | Single Machine |
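The comparison operators mentioned above (equal, begins with, contains) can be illustrated with a few lines of Python. The sketch only shows how such operators select candidate records; it does not reproduce the fuzzy matching of the TAXAMATCH approach. The function name `match`, the record layout and the sample records are hypothetical.

```python
OPERATORS = {
    "equal": lambda field, term: field == term,
    "begins with": lambda field, term: field.startswith(term),
    "contains": lambda field, term: term in field,
}


def match(records, genus, species, genus_op="equal", species_op="equal"):
    """Return the records whose genus and species satisfy the chosen operators."""
    g, s = genus.lower(), species.lower()
    return [
        r for r in records
        if OPERATORS[genus_op](r["genus"].lower(), g)
        and OPERATORS[species_op](r["species"].lower(), s)
    ]


# Hypothetical reference records, for illustration only.
records = [
    {"genus": "Gadus", "species": "morhua"},
    {"genus": "Gadus", "species": "macrocephalus"},
    {"genus": "Thunnus", "species": "albacares"},
]
print(match(records, "Gadus", "m", species_op="begins with"))
```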
GET_OCCURRENCES_ALGORITHM
| |
Description | An algorithm that retrieves the occurrences from a data provider based on the given search options |
A transducer algorithm that produces a dataset of species occurrences for a set of target species by retrieving these from major data providers including GBIF and OBIS. It accepts as input a list of species names and parameters including the data provider to use and query expansion criteria. It produces a DarwinCore file with the occurrences. <limitation>. For more information see: <citation/ref> | |
Type | Transducers |
Execution | Single Machine |
GET_TAXA_ALGORITHM
| |
Description | An algorithm that retrieves the taxon from a data provider based on the given search options |
A transducer algorithm that produces a dataset of species taxonomic information for a set of target species by retrieving these from major data providers including Catalogue of Life, OBIS, WoRMS. It accepts as input a list of species names and parameters including the data provider to use and query expansion criteria. It produces a DarwinCore file with the occurrences. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Transducers |
Execution | Single Machine |
HCAF_FILTER
| |
Description | An algorithm producing a HCAF table on a selected Bounding Box (default identifies Indonesia) |
A transducer algorithm that produces a version of a Half-degree Cells Authority File (HCAF) dataset with environmental parameters to be used by the AquaMaps approach for a target area. It accepts as input the table and the bounding box representing the target area. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Transducers |
Execution | Single Machine |
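Restricting a table of half-degree cells to a bounding box amounts to a simple coordinate test on each cell centre, as the sketch below shows. The column names `center_long` and `center_lat`, the function name and the sample coordinates are assumptions made for illustration; they are not the service's defaults.

```python
def filter_by_bounding_box(rows, min_lon, min_lat, max_lon, max_lat):
    """Keep only the cells whose centre falls inside the bounding box (borders included)."""
    return [
        row for row in rows
        if min_lon <= row["center_long"] <= max_lon
        and min_lat <= row["center_lat"] <= max_lat
    ]


# Hypothetical half-degree cells identified by their centre coordinates.
cells = [
    {"cell": "a", "center_long": 110.25, "center_lat": -5.25},
    {"cell": "b", "center_long": 12.75, "center_lat": 43.25},
]
print(filter_by_bounding_box(cells, 95.0, -11.0, 141.0, 6.0))
```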
HCAF_INTERPOLATION
| |
Description | Evaluates the impact of climatic changes on species presence |
A transducer algorithm that generates, by interpolation, a number of Half-degree Cells Authority File (HCAF) datasets with environmental parameters to be used by the AquaMaps approach. It accepts as input the HCAF table representing the starting case, the HCAF table representing the ending case and parameters affecting interpolation such as the number of tables to produce and the interpolation function to use, e.g. linear, parabolic. It produces <output>. <limitation>. For more information see: <citation/ref> | |
Type | Transducers |
Execution | Single Machine |
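The interpolation between a starting and an ending table can be sketched as below. Several details are assumptions, since the page does not specify them: the produced tables are taken to be intermediate steps at evenly spaced fractions between the two cases, "parabolic" is modelled as a quadratic weight t², and tables are represented as nested dictionaries keyed by cell and parameter name. The function name `interpolate_tables` is hypothetical.

```python
SHAPES = {
    "linear": lambda t: t,
    "parabolic": lambda t: t * t,  # assumed shape; the service may use a different parabola
}


def interpolate_tables(start, end, n_tables, function="linear"):
    """Return n_tables intermediate tables between start and end (both excluded)."""
    shape = SHAPES[function]
    tables = []
    for i in range(1, n_tables + 1):
        w = shape(i / (n_tables + 1))  # weight of the ending case at this step
        tables.append({
            cell: {name: a + w * (end[cell][name] - a) for name, a in params.items()}
            for cell, params in start.items()
        })
    return tables


# Hypothetical one-cell tables with a single environmental parameter.
start = {"cell-1": {"sst": 10.0}}
end = {"cell-1": {"sst": 14.0}}
print(interpolate_tables(start, end, 3))
```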
HRS
| |
Description | An evaluator algorithm that calculates the Habitat Representativeness Score, i.e. an indicator of the assessment of whether a specific survey coverage or another environmental features dataset, contains data that are representative of all available habitat variable combinations in an area. |
An evaluator algorithm that calculates the Habitat Representativeness Score, i.e. an indicator of the assessment of whether a specific survey coverage contains data that are representative of all available habitat variable combinations in an area. It accepts as input the target area, a table with positive cases and a table with negative cases. It produces <output>. <limitation>. For more information see: Colin D. MacLeod (2010). Habitat representativeness score (HRS): a novel concept for objectively assessing the suitability of survey coverage for modelling the distribution of marine species. Journal of the Marine Biological Association of the United Kingdom, 90, pp 1269-1277. doi:10.1017/S0025315410000408. | |
Type | Evaluators |
Execution | Single Machine |
HSPEN
| |
Description | The AquaMaps HSPEN algorithm. A modeling algorithm that generates a table containing species envelopes (HSPEN), i.e. models capturing species tolerance with respect to environmental parameters, to be used by the AquaMaps approach. |
A modeling algorithm that generates a table containing species envelopes (HSPEN), i.e. models capturing species tolerance with respect to environmental parameters, to be used by the AquaMaps approach. It accepts as input a starting version of the HSPEN table, a table containing a Half-degree Cells Authority File (HCAF) dataset with environmental parameters, and a table containing species occurrence data in half-degree cells. It produces <output>. <limitation>. For more information see: Kesner-Reyes, K., K. Kaschner, S. Kullander, C. Garilao, J. Barile, and R. Froese. 2012. AquaMaps: algorithm and data sources for aquatic organisms. In: Froese, R. and D. Pauly. Editors. 2012. FishBase. World Wide Web electronic publication. www.fishbase.org, version (04/2012). | |
Type | Models |
Execution | Single Machine |
HSPEN_FILTER
| |
Description | An algorithm producing a HSPEN table containing only the selected species |
A transducer algorithm that generates a table containing species envelopes (HSPEN), i.e. models capturing species tolerance with respect to environmental parameters, to be used by the AquaMaps approach for a set of target species. It accepts as input a starting version of the HSPEN table and a list of target species. It produces <output>. <limitation>. For more information see: Kesner-Reyes, K., K. Kaschner, S. Kullander, C. Garilao, J. Barile, and R. Froese. 2012. AquaMaps: algorithm and data sources for aquatic organisms. In: Froese, R. and D. Pauly. Editors. 2012. FishBase. World Wide Web electronic publication. www.fishbase.org, version (04/2012). | |
Type | Transducers |
Execution | Single Machine |
TIMEEXTRACTION
| |
Description | An algorithm to extract a time series of values associated with a geospatial features repository (e.g. NETCDF, ASC, GeoTiff files, etc.). The algorithm analyses the time series and automatically searches for hidden periodicities. It produces one chart of the time series, one table containing the time series values and possibly the spectrogram. |
Type | Transducer |
Execution | Single machine |
ZETAEXTRACTION_TABLE
| |
Description | An algorithm to extract a time series of values associated with a table containing geospatial information. The algorithm analyses the time series and automatically searches for hidden periodicities. It produces one chart of the time series, one table containing the time series values and possibly the spectrogram. |
Type | Transducer |
Execution | Single machine |