Creating a product that can truly shape the parenting experience of millions of families requires high-quality, time-tested technology. For many years, our team has strived to be a market leader through its own unique developments.
Introduction
Interpreting an infant's emotional state based on their cry is critical in both medical and everyday contexts. Crying is the primary form of communication for infants and often contains bio-signals that reflect their needs and well-being. Our system applies a comprehensive approach to analyze these signals using multi-domain feature extraction, time-series visualization, and hierarchical classification powered by machine learning models.
System Architecture
Our solution consists of a modular pipeline:
- Acoustic segmentation: identifying cry episodes
- Digital signal preprocessing: normalization, filtering, and background noise reduction
- Multi-level feature extraction:
  - ZCR, RMS: time-domain dynamics
  - Mel-spectrogram, MFCC: spectral and cepstral features
  - TSI (Time Series Imaging): converting MFCC into image representations
- Machine learning classification: predicting the cry cause
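The source does not say how cry episodes are detected, so the following is only a minimal sketch of one common approach to acoustic segmentation: thresholding short-time energy. The function name `segment_cry_episodes`, the 25 ms frame length, the relative threshold, and the minimum episode length are all illustrative assumptions, not the AYA implementation.

```python
import numpy as np

def segment_cry_episodes(signal, sr=22050, frame_ms=25,
                         threshold_ratio=0.1, min_len_s=0.2):
    """Return (start_s, end_s) pairs where short-time RMS exceeds a threshold.

    All parameter defaults are assumptions chosen for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    frame = int(sr * frame_ms / 1000)           # samples per analysis frame
    n_frames = len(signal) // frame
    if n_frames == 0:
        return []
    frames = signal[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # A frame is "active" if its RMS exceeds a fraction of the loudest frame.
    active = rms > threshold_ratio * rms.max()

    runs, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i                            # episode begins
        elif not is_active and start is not None:
            runs.append((start, i))              # episode ends
            start = None
    if start is not None:
        runs.append((start, n_frames))

    # Convert frame indices to seconds and drop very short bursts.
    episodes = [(s * frame / sr, e * frame / sr) for s, e in runs]
    return [(s, e) for s, e in episodes if e - s >= min_len_s]
```

A relative threshold keeps the sketch independent of recording gain, but it would treat a recording containing only background noise as active, so a production system would likely need an absolute noise floor or a trained detector.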
Audio Signal Processing
Audio samples are recorded at a sampling rate of 22,050 Hz, and each sample is 5 seconds long. Preprocessing includes:
- Pre-emphasis filtering
- Windowing
- Normalization
- Amplitude equalization
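As a rough illustration of the preprocessing steps named here (pre-emphasis, normalization, windowing), the sketch below uses plain NumPy. The pre-emphasis coefficient of 0.97, the 512-sample frame, the 256-sample hop, the Hamming window, and the order of operations are assumptions; the source does not specify them, and amplitude equalization is not covered.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - alpha * x[:-1])

def peak_normalize(x, eps=1e-12):
    """Scale the signal so its largest absolute sample is about 1."""
    return x / (np.max(np.abs(x)) + eps)

def frame_and_window(x, frame_len=512, hop=256):
    """Split into overlapping frames and apply a Hamming window to each."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

# Example on a synthetic 5-second clip at 22,050 Hz (stand-in for a recording).
sr = 22050
t = np.arange(5 * sr) / sr
clip = 0.3 * np.sin(2 * np.pi * 450 * t)
frames = frame_and_window(peak_normalize(pre_emphasis(clip)))
```

Each row of `frames` is one windowed analysis frame, which is the shape that frame-level feature extractors typically expect.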
Time-Domain & Frequency Features
- ZCR (Zero Crossing Rate): the rate at which the signal changes sign
- RMS (Root Mean Square): measures the energy of the waveform
- Mel-Spectrogram: a time-frequency representation on the mel scale, which approximates human auditory perception
- MFCC (Mel-Frequency Cepstral Coefficients): extracted via DFT → Mel filterbank → log → DCT to approximate auditory encoding
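The features above can be sketched in NumPy as follows. This is an illustrative approximation rather than the production code: the frame size, hop, 40 mel bands, 13 coefficients, the HTK-style mel formula, and the unnormalized DCT-II are all assumptions, and a library such as librosa would usually be used instead.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames (rows)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return np.asarray(x, dtype=float)[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def rms_energy(frames):
    """Root mean square amplitude per frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, ctr, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfcc(frames, sr=22050, n_mels=40, n_mfcc=13):
    """DFT -> mel filterbank -> log -> DCT, one row of coefficients per frame."""
    n_fft = frames.shape[1]
    power = np.abs(np.fft.rfft(frames * np.hamming(n_fft), axis=1)) ** 2  # DFT
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T              # mel filterbank
    log_mel = np.log(mel_energy + 1e-10)                                  # log compression
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))            # unnormalized DCT-II
    return log_mel @ dct_basis.T
```

For a signal `x`, calling `mfcc(frame_signal(x))` yields an array with one row per frame and 13 columns, while `zero_crossing_rate` and `rms_energy` each return one value per frame.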
Time-Series Imaging (TSI)
To enhance classification accuracy, MFCC vectors are transformed into images:
- GADF (Gramian Angular Difference Field)
- RP (Recurrence Plot)
- MTF (Markov Transition Field)
These images are used to train machine vision models.
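To make the three imaging transforms concrete, here is a minimal NumPy sketch that turns a single 1-D series (for example, one MFCC coefficient tracked across frames, which is an assumption about how the vectors are fed in) into GADF, RP, and MTF images. The recurrence threshold `eps`, the number of quantile bins, and the use of a scalar series without time-delay embedding are simplifying assumptions; the series is also assumed to be non-constant.

```python
import numpy as np

def rescale(x):
    """Min-max scale a non-constant series into [-1, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def gadf(x):
    """Gramian Angular Difference Field: sin(phi_i - phi_j), phi = arccos(x)."""
    phi = np.arccos(np.clip(rescale(x), -1.0, 1.0))
    return np.sin(phi[:, None] - phi[None, :])

def recurrence_plot(x, eps=0.1):
    """Binary image marking pairs of time points whose values are within eps."""
    x = np.asarray(x, dtype=float)
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(np.uint8)

def mtf(x, n_bins=4):
    """Markov Transition Field built from quantile bins of the series."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)                    # bin index in 0..n_bins-1
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(q[:-1], q[1:]):
        W[a, b] += 1                             # count bin-to-bin transitions
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W[q[:, None], q[None, :]]             # transition probability per time pair
```

Each function returns a square image whose side equals the series length, so the outputs can be stacked as channels or resized before being passed to an image classifier.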
Results and Metrics
Model ensemble:
- Random Forest
- Support Vector Machine (SVM)
- K-Nearest Neighbors (KNN)
- BaggingClassifier
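The source lists the ensemble members but not how their predictions are combined. The sketch below assumes hard (majority) voting via scikit-learn's `VotingClassifier`, and it uses synthetic data from `make_classification` as a stand-in for real cry features; the three classes, hyperparameters, and scaling steps are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for extracted cry features (NOT real data).
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        # SVM and KNN are distance-based, so features are standardized first.
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        # BaggingClassifier uses decision trees as its default base estimator.
        ("bag", BaggingClassifier(n_estimators=25, random_state=0)),
    ],
    voting="hard",  # assumption: majority vote across the four models
)
ensemble.fit(X_train, y_train)
pred = ensemble.predict(X_test)
```

Hard voting needs only class labels from each model; soft voting (averaging predicted probabilities) is an alternative when every member can produce calibrated probabilities.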
Conclusion
Our system integrates audio signal processing, time-series imaging, computer vision, and machine learning to interpret baby cries. It provides real-time insight for parents and holds future promise for pediatric diagnostics.
The AYA app is available for free on the App Store and Google Play.