Difference between revisions of "Signal Processing"

From Gcube Wiki

Revision as of 15:29, 3 May 2013

Signal Processing is a set of facilities for analyzing signals, i.e. measurements of time-varying or spatially varying physical quantities. It is part of the gCube system facilities for Data Mining and Processing. It is mainly used to discover seasonality and periodicity in time series of real-valued observations. Such observations can refer, for example, to catch statistics in fisheries, marine species occurrences, or modulations of environmental parameters. The Signal Processing facilities are part of the Ecological Engine gCube library. This library hosts all the basic data processing and mining procedures for biological and environmental datasets.

The Signal Processing facilities mainly address the following tasks:

  • Reconstruct a uniformly sampled time series from a non-uniform time series
  • Perform Short-Time Standard Fourier Analysis
  • Trace the Spectrogram of a Time Series
  • Highlight periodicity in a Time Series

Overview

Signal Processing is used in many ways by the gCube-based e-Infrastructures. GIS layers, which contain geographical information, can report the variations of environmental parameters over time. This information can be stored in NetCDF files as well as on remote GeoServers. Geographical maps can contain information about the distributions of environmental parameters or species, but these are usually not uniformly defined: at some points in time and space, values can be missing. For this reason, the Ecological Engine library combines data mining techniques with the Signal Processing facilities in order to fill the gaps, reconstruct signals and produce time-frequency analyses.

Features

The features currently supported by the Signal Processing facilities include:

  • signal reconstruction: rebuilds a time series which is not uniformly sampled in time;
  • spectrogram calculation and display: produces the spectrogram of a signal with the Short-Time Fourier Transform (STFT) technique, according to a certain sampling frequency and time-window shift;
  • multi-signal analysis by means of summed spectrogram: analyzes several synchronized signals and produces a spectrogram which is the sum of the single spectrograms;
  • delta + double delta features: produces the delta and double delta features, related to the first and second derivative of the signal;
  • center frequency calculation: calculates the central frequency in a filterbank;
  • cepstral coefficients calculation: calculates the cepstral coefficients of a signal, which store much of the information contained in the signal;
  • spectrum frequency band cut: cuts the signal spectrum according to a certain frequency band;
  • filterbanks: produces a filterbank for filtering the signal in a complex way;
  • mel filterbanks: builds a perceptually inspired filterbank based on the mel frequencies distribution.
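
As an illustration of the delta and double delta features listed above, the following is a minimal sketch, assuming a central-difference approximation of the derivative with replicated edge samples. The class name DeltaFeatures and its methods are hypothetical and are not taken from the Ecological Engine library, which may use a different regression window.

```java
/** Illustrative sketch only: DeltaFeatures is a hypothetical name, not an Ecological Engine class. */
final class DeltaFeatures {
    private DeltaFeatures() {}

    /** Central difference (x[i+1] - x[i-1]) / 2, with the first and last samples replicated at the edges. */
    static double[] delta(double[] x) {
        double[] d = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double prev = x[Math.max(i - 1, 0)];
            double next = x[Math.min(i + 1, x.length - 1)];
            d[i] = (next - prev) / 2.0;
        }
        return d;
    }

    /** Double delta: the delta of the delta, related to the second derivative. */
    static double[] doubleDelta(double[] x) {
        return delta(delta(x));
    }
}
```

For a quadratic trend such as 0, 1, 4, 9, 16 the interior delta values grow linearly and the central double delta value is constant, as expected for first and second derivatives.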

A set of utilities is included in the Ecological Engine library in order to perform the above operations:

  • linear frequency to mel frequency transformation
  • frequency to index in Short-Time Fourier Transform
  • transformation to and from Rapid Miner Example Set
  • sinusoid signal generation
  • inverse mel calculation
  • sample to time and time to sample conversions
  • signal timeline generation
  • index to time conversion for spectrograms
  • time to index conversion for spectrograms
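
Some of the conversions listed above can be sketched as follows. This is only an illustration under stated assumptions: the class FrequencyUtils and its method names are hypothetical, and the mel formula used here (2595 * log10(1 + f / 700)) is one common convention that may differ from the one adopted in the library.

```java
/** Illustrative sketch only: FrequencyUtils is a hypothetical name, not an Ecological Engine class. */
final class FrequencyUtils {
    private FrequencyUtils() {}

    /** Linear frequency (Hz) to mel, assuming the 2595 * log10(1 + f / 700) convention. */
    static double hzToMel(double hz) {
        return 2595.0 * Math.log10(1.0 + hz / 700.0);
    }

    /** Inverse mel calculation: mel back to linear frequency (Hz). */
    static double melToHz(double mel) {
        return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0);
    }

    /** Index of the STFT bin closest to a frequency, for frames of frameLength samples. */
    static int frequencyToIndex(double hz, int samplingRate, int frameLength) {
        return (int) Math.round(hz * frameLength / samplingRate);
    }

    /** Sample position to time, in the time unit implied by the sampling rate. */
    static double sampleToTime(int sample, int samplingRate) {
        return (double) sample / samplingRate;
    }

    /** Time to the nearest sample position. */
    static int timeToSample(double time, int samplingRate) {
        return (int) Math.round(time * samplingRate);
    }
}
```

With this convention 1000 Hz maps to approximately 1000 mel, and melToHz inverts hzToMel up to floating-point rounding.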

Software

The software is available on the gCube maven repository and can be included by declaring the following dependency in the pom.xml file:

<dependency>
  <groupId>org.gcube.dataanalysis</groupId>
  <artifactId>ecological-engine</artifactId>
  <version>1.6.1-SNAPSHOT</version>
</dependency>

An example call that runs the spectrogram analysis with STFT and produces the chart is:

SignalConversions.spectrogram(name, signal, samplingRate, windowshift, frameslength, display);

Where the input variables are:

String name: the title of the chart
double[] signal: the sequence of values representing the trend
int samplingRate: the sampling frequency, given as an integer that is a multiple of 2
int windowshift: the window shift of the STFT in samples
int frameslength: the length of each window in samples
boolean display: a flag to ask the procedure to run an applet which displays the spectrogram
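
To show how the window shift and the frame length interact, the following is a minimal sketch of a magnitude spectrogram computed with a naive discrete Fourier transform on Hann-windowed frames. It is not the implementation behind SignalConversions.spectrogram: the class StftSketch, its methods and the choice of a Hann window are assumptions made for illustration only.

```java
/** Illustrative sketch only: StftSketch is a hypothetical name, not an Ecological Engine class. */
final class StftSketch {
    private StftSketch() {}

    /** Returns magnitudes indexed as [frame][bin], with bins from 0 to frameLength / 2. */
    static double[][] spectrogram(double[] signal, int windowShift, int frameLength) {
        if (windowShift <= 0 || frameLength <= 0 || signal.length < frameLength) {
            throw new IllegalArgumentException("invalid window shift or frame length");
        }
        int frames = (signal.length - frameLength) / windowShift + 1;
        int bins = frameLength / 2 + 1;
        double[][] magnitudes = new double[frames][bins];
        for (int f = 0; f < frames; f++) {
            int start = f * windowShift; // each frame starts windowShift samples after the previous one
            for (int k = 0; k < bins; k++) {
                double re = 0.0;
                double im = 0.0;
                for (int n = 0; n < frameLength; n++) {
                    // Periodic Hann window (an assumption, chosen to reduce spectral leakage)
                    double w = 0.5 - 0.5 * Math.cos(2.0 * Math.PI * n / frameLength);
                    double angle = -2.0 * Math.PI * k * n / frameLength;
                    re += w * signal[start + n] * Math.cos(angle);
                    im += w * signal[start + n] * Math.sin(angle);
                }
                magnitudes[f][k] = Math.hypot(re, im);
            }
        }
        return magnitudes;
    }

    /** Index of the strongest bin in one frame. */
    static int peakBin(double[] frame) {
        int best = 0;
        for (int k = 1; k < frame.length; k++) {
            if (frame[k] > frame[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

A sinusoid with a period of 8 samples analyzed with frames of 32 samples peaks at bin 32 / 8 = 4 in every frame, which is how a constant periodicity shows up as a continuous line in the spectrogram. The naive transform costs O(frameLength²) per frame; a production implementation would typically use an FFT.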

An example which performs a signal reconstruction is:

AlgorithmConfiguration config = new AlgorithmConfiguration();
config.setConfigPath(configDir);
config.initRapidMiner();
SignalProcessing.fillSignal(signal);

where the input parameters are:

double[] signal: the sequence of values representing the trend
String configDir: a configuration folder containing the configuration files required by the Ecological Engine library

The cfg directory and the Ecological Engine library are accessible at this svn link: http://svn.research-infrastructures.eu/d4science/gcube/trunk/data-analysis/EcologicalEngine

Experiments

In the following experiments we give an idea of the transformations and processing that can be applied to signals with the Signal Processing (SP) facilities included in the Ecological Engine library. We selected a study area around Bari, Italy (ref. Fig. 1).

Figure 1. A study area around Bari, Italy.

We then extracted the temperature time series at a point in that area with coordinates (17.59;41.37). We downloaded a NetCDF file from the MyOceans repository, which contained the mean monthly temperature variations between 2000 and 2010. Using the gCube geographical extraction facilities for NetCDF files, we extracted the time trend of the temperature. Using the Signal Processing facilities, we produced the chart in Fig. 2.

Figure 2. Variation of sea surface temperature between the years 2000 and 2010 in the point (17.59;41.37).

We used the spectrogram generation facility to produce the spectrogram plot of the signal (ref. Fig. 3). A continuous line is evident at ... Hz, which corresponds to a period of approximately one year.

Figure 3. Spectrogram of the monthly temperature trend between the years 2000 and 2010 in the point (17.59;41.37).

The simple spectrogram produced by the STFT was thus able to reveal a hidden periodicity in the trend.

As a further example, we report the usage of the SP facilities on a signal that is not uniformly sampled in time. We took the trend of some earthquakes reported for the region of Garfagnana (Tuscany, Italy). The points are not equally spaced in time, which means that the trend is not uniformly sampled.

Figure 4. Trend of the earthquakes in Garfagnana at the beginning of 2013. The trend is not uniformly sampled.

By applying a K-Nearest Neighbor data mining process, we were able to reconstruct the signal at the missing points. We simulated a sampling period of 4 minutes and eventually obtained the trend in Fig. 5.

Figure 5. Reconstructed trend of the earthquakes with 4 minutes of time sampling.
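
The reconstruction step can be sketched as follows. This is a simplified illustration and not the procedure used by the library: it assumes that each point of the uniform grid takes the average of the k samples closest in time, and the class KnnResampler, its method names and parameters are hypothetical.

```java
/** Illustrative sketch only: KnnResampler is a hypothetical name, not an Ecological Engine class. */
final class KnnResampler {
    private KnnResampler() {}

    /**
     * Rebuilds a uniformly sampled series from samples taken at irregular times.
     * The times array must be sorted in ascending order; step is the new sampling period.
     */
    static double[] resample(double[] times, double[] values, double step, int k) {
        if (times.length == 0 || times.length != values.length
                || step <= 0 || k < 1 || k > times.length) {
            throw new IllegalArgumentException("invalid input series, step or k");
        }
        double start = times[0];
        double end = times[times.length - 1];
        int n = (int) Math.floor((end - start) / step) + 1;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = knnAverage(times, values, start + i * step, k);
        }
        return out;
    }

    /** Average of the k samples whose timestamps are closest to t. */
    private static double knnAverage(double[] times, double[] values, double t, int k) {
        boolean[] used = new boolean[times.length];
        double sum = 0.0;
        for (int j = 0; j < k; j++) {
            int best = -1;
            for (int i = 0; i < times.length; i++) {
                if (!used[i] && (best < 0
                        || Math.abs(times[i] - t) < Math.abs(times[best] - t))) {
                    best = i;
                }
            }
            used[best] = true;
            sum += values[best];
        }
        return sum / k;
    }
}
```

With timestamps expressed in minutes, a call such as KnnResampler.resample(times, values, 4.0, k) would produce a series with a 4-minute sampling period; the value of k is a free choice in this sketch.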

At this point we could produce the spectrogram of the reconstructed signal. Surprisingly, it highlights three well-defined periods hidden in the first part of the signal. The corresponding frequencies are superimposed (ref. Fig. 6).

Figure 6. Spectrogram of the earthquake trend. Three hidden periods are detected by the STFT.