Performance Analysis of Feature Mel Frequency Cepstral Coefficient and Short Time Fourier Transform Input for Lie Detection using Convolutional Neural Network

Dewi Kusumawati - Hasanuddin University, Gowa, South Sulawesi 92171, Indonesia
Amil Ahmad Ilham - Hasanuddin University, Gowa, South Sulawesi 92171, Indonesia
Andani Achmad - Hasanuddin University, Gowa, South Sulawesi 92171, Indonesia
Ingrid Nurtanio - Hasanuddin University, Gowa, South Sulawesi 92171, Indonesia


This study compares two feature extraction methods for lie detection from speech, the Mel Frequency Cepstral Coefficient (MFCC) and the Short Time Fourier Transform (STFT), using a Convolutional Neural Network (CNN) classifier. Both features are computed from digital voice data taken from video recordings labeled as lies or truths about particular situations. The data are then pre-processed and used to train the CNN. Model evaluation with hyperparameter tuning via random search shows that MFCC-based voice feature extraction yields better performance, with higher accuracy, than the STFT process. The best MFCC configuration (filter convolutional1=64, kernel convolutional1=5, filter convolutional2=112, kernel convolutional2=3, filter convolutional3=32, kernel convolutional3=5, dense1=96, optimizer=RMSProp, learning rate=0.001) achieves an accuracy of 97.13% with an AUC of 0.97. The best STFT configuration (filter convolutional1=96, kernel convolutional1=5, filter convolutional2=48, kernel convolutional2=5, filter convolutional3=96, kernel convolutional3=5, dense1=128, optimizer=Adadelta, learning rate=0.001) achieves an accuracy of 95.39% with an AUC of 0.95. Prosodic features were used as a baseline against MFCC and STFT and reached only 68% accuracy. The analysis shows that MFCC feature extraction combined with the CNN model gives the best performance for audio-based lie detection. Future research could combine CNN architectural models such as ResNet, AlexNet, and other architectures to obtain new models and further improve lie detection accuracy.
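To make the two front-ends concrete, the MFCC pipeline (framing, windowing, STFT power spectrum, mel filterbank, log, DCT) can be sketched in plain NumPy. The frame length, hop size, and filter counts below are illustrative defaults, not the configuration used in this study:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    # 1. Frame the signal and apply a Hamming window to each frame.
    frames = np.array([signal[s:s + n_fft] * np.hamming(n_fft)
                       for s in range(0, len(signal) - n_fft + 1, hop)])

    # 2. Short-time power spectrum -- this step by itself is the STFT feature.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 3. Triangular mel filterbank, with centers spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, ctr, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, ctr):
            fbank[i - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[i - 1, k] = (hi - k) / max(hi - ctr, 1)

    # 4. Log filterbank energies, then DCT-II to decorrelate -> cepstral coefficients.
    feats = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2.0 * n_mels)))
    return feats @ dct.T  # shape: (num_frames, n_ceps)
```

Stacking the per-frame coefficients over time yields the 2-D feature map that the CNN consumes; using the step-2 power spectrogram directly instead corresponds to the STFT input variant.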


MFCC; STFT; CNN; Lie Detection; Parameters

