{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T22:22:22Z","timestamp":1774390942422,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2022,10,19]],"date-time":"2022-10-19T00:00:00Z","timestamp":1666137600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natixis"},{"name":"Institut Europlace de Finance"},{"name":"Laboratoire de Probabilit\u00e9s"},{"name":"Statistique et Mod\u00e9lisation (LPSM)\/Universit\u00e9 Paris Cit\u00e9"},{"name":"Cr\u00e9dit Agricole CIB"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>A major concern when dealing with financial time series involving a wide variety of market risk factors is the presence of anomalies. These induce a miscalibration of the models used to quantify and manage risk, resulting in potential erroneous risk measures. We propose an approach that aims to improve anomaly detection in financial time series, overcoming most of the inherent difficulties. Valuable features are extracted from the time series by compressing and reconstructing the data through principal component analysis. We then define an anomaly score using a feedforward neural network. A time series is considered to be contaminated when its anomaly score exceeds a given cutoff value. This cutoff value is not a hand-set parameter but rather is calibrated as a neural network parameter throughout the minimization of a customized loss function. The efficiency of the proposed approach compared to several well-known anomaly detection algorithms is numerically demonstrated on both synthetic and real data sets, with high and stable performance being achieved with the PCA NN approach. We show that value-at-risk estimation errors are reduced when the proposed anomaly detection model is used with a basic imputation approach to correct the anomaly.<\/jats:p>","DOI":"10.3390\/a15100385","type":"journal-article","created":{"date-parts":[[2022,10,19]],"date-time":"2022-10-19T20:32:23Z","timestamp":1666211543000},"page":"385","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Anomaly Detection in Financial Time Series by Principal Component Analysis and Neural Networks"],"prefix":"10.3390","volume":"15","author":[{"given":"St\u00e9phane","family":"Cr\u00e9pey","sequence":"first","affiliation":[{"name":"CNRS, Laboratoire de Probabilit\u00e9s, Statistique et Mod\u00e9lisation (LPSM), Team MathFiPronum, Universit\u00e9 Paris Cit\u00e9, 75013 Paris, France"}]},{"given":"Noureddine","family":"Lehdili","sequence":"additional","affiliation":[{"name":"Natixis Entreprise Risk Management Department, 75013 Paris, France"}]},{"given":"Nisrine","family":"Madhar","sequence":"additional","affiliation":[{"name":"CNRS, Laboratoire de Probabilit\u00e9s, Statistique et Mod\u00e9lisation (LPSM), Team MathFiPronum, Universit\u00e9 Paris Cit\u00e9, 75013 Paris, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1369-055X","authenticated-orcid":false,"given":"Maud","family":"Thomas","sequence":"additional","affiliation":[{"name":"CNRS, Laboratoire de Probabilit\u00e9s, Statistique et Mod\u00e9lisation (LPSM), Team Statistics, Sorbonne Universit\u00e9, 75005 Paris, France"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,19]]},"reference":[{"key":"ref_1","unstructured":"Basel Committee on Banking Supervision (2013). Consultative Document: Fundamental Review of the Trading Book: A Revised Market Risk Framework, Basel Committee on Banking Supervision."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hawkins, D.M. (1980). Identification of Outliers, Springer.","DOI":"10.1007\/978-94-015-3994-4"},{"key":"ref_3","unstructured":"Cheng, Y., Diakonikolas, I., Ge, R., and Woodruff, D. (2019). Faster algorithms for high-dimensional robust covariance estimation. arXiv."},{"key":"ref_4","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"00037","DOI":"10.1051\/itmconf\/20182300037","article-title":"Kernel density estimation and its application","volume":"23","year":"2018","journal-title":"ITM Web Conf."},{"key":"ref_6","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_7","unstructured":"Le Guennec, A., Malinowski, S., and Tavenard, R. (2016, January 19\u201323). Data augmentation for time series classification using convolutional neural networks. Proceedings of the ECML\/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, Riva del Garda, Italy."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Um, T.T., Pfister, F.M., Pichler, D., Endo, S., Lang, M., Hirche, S., Fietzek, U., and Kuli\u0107, D. (2017, January 13\u201317). Data augmentation of wearable sensor data for parkinson\u2019s disease monitoring using convolutional neural networks. Proceedings of the 19th ACM International Conference on Multimodal Interaction, Glasgow, UK.","DOI":"10.1145\/3136755.3136817"},{"key":"ref_9","unstructured":"Brownlee, J. (2020). Imbalanced Classification with Python: Better Metrics, Balance Skewed Classes, Cost-Sensitive Learning, Machine Learning Mastery."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Saito, T., and Rehmsmeier, M. (2015). The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0118432"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Chinchor, N., and Sundheim, B.M. (1993, January 25\u201327). MUC-5 evaluation metrics. Proceedings of the Fifth Message Understanding Conference (MUC-5): Proceedings of a Conference, Baltimore, Maryland.","DOI":"10.3115\/1072017.1072023"},{"key":"ref_12","unstructured":"Van Rijsbergen, C. (1979). Information retrieval: Theory and practice. Data Base Systems: Joint IBM\/University of Newcastle Upon Tyne Seminar Held in the University Computing Laboratory, 4th\u20137th September, 1979, University of Newcastle Upon Tyne Computing Laboratory."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1037\/1082-989X.12.3.336","article-title":"Nonlinear principal components analysis: Introduction and application","volume":"12","author":"Linting","year":"2007","journal-title":"Psychol. Methods"},{"key":"ref_14","unstructured":"Akyildirim, E., Gambara, M., Teichmann, J., and Zhou, S. (2022). Applications of signature methods to market anomaly detection. arXiv."},{"key":"ref_15","unstructured":"Polson, N., Sokolov, V., and Xu, J. (2021). Deep Learning Partial Least Squares. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Eichhorn, M., Bellini, T., and Mayenberger, D. (2021). Reverse Stress Testing in Banking: A Comprehensive Guide, De Gruyter.","DOI":"10.1515\/9783110647907"},{"key":"ref_17","unstructured":"Chandola, V. (2009). Anomaly Detection for Symbolic Sequences and Time Series Data. [Ph.D. Thesis, University of Minnesota]."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, J., and Paschalidis, I.C. (2017). Statistical Anomaly Detection via Composite Hypothesis Testing for Markov Models. arXiv.","DOI":"10.1109\/TSP.2017.2771722"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2463","DOI":"10.1109\/TPAMI.2020.2970410","article-title":"Real-time nonparametric anomaly detection in high-dimensional settings","volume":"43","author":"Kurt","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Laptev, N., Amizadeh, S., and Flint, I. (2015, January 10\u201313). Generic and scalable framework for automated time-series anomaly detection. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia.","DOI":"10.1145\/2783258.2788611"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Gao, J., and Tan, P.N. (2006, January 18\u201322). Converting output scores from outlier detection algorithms into probability estimates. Proceedings of the Sixth International Conference on Data Mining (ICDM 06), Hong Kong, China.","DOI":"10.1109\/ICDM.2006.43"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-3255-x","article-title":"Learning misclassification costs for imbalanced classification on gene expression data","volume":"20","author":"Lu","year":"2019","journal-title":"BMC Bioinform."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1109\/LSP.2009.2017477","article-title":"Snake validation: A PCA-based outlier detection method","volume":"16","author":"Saha","year":"2009","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wen, Q., Sun, L., Yang, F., Song, X., Gao, J., Wang, X., and Xu, H. (2020). Time series data augmentation for deep learning: A survey. arXiv.","DOI":"10.24963\/ijcai.2021\/631"},{"key":"ref_25","unstructured":"Cui, Z., Chen, W., and Chen, Y. (2016). Multi-scale convolutional neural networks for time series classification. arXiv."},{"key":"ref_26","unstructured":"Gao, J., Song, X., Wen, Q., Wang, P., Sun, L., and Xu, H. (2020). Robusttad: Robust time series anomaly detection via decomposition and convolutional neural networks. arXiv."},{"key":"ref_27","unstructured":"Esteban, C., Hyland, S.L., and R\u00e4tsch, G. (2017). Real-valued (medical) time series generation with recurrent conditional gans. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kondratyev, A., Schwarz, C., and Horvath, B. (2020). Data anonymisation, outlier detection and fighting overfitting with restricted Boltzmann machines. Outlier Detection and Fighting Overfitting with Restricted Boltzmann Machines, SSRN.","DOI":"10.2139\/ssrn.3526436"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1613\/jair.3623","article-title":"Toward supervised anomaly detection","volume":"46","author":"Kloft","year":"2013","journal-title":"J. Artif. Intell. Res."},{"key":"ref_30","unstructured":"Ruff, L., Vandermeulen, R.A., G\u00f6rnitz, N., Binder, A., M\u00fcller, E., M\u00fcller, K.R., and Kloft, M. (2019). Deep semi-supervised anomaly detection. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhao, Y., and Hryniewicki, M.K. (2018, January 8\u201313). XGBOD: Improving supervised outlier detection with unsupervised representation learning. Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489605"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1454","DOI":"10.1007\/s10618-020-00701-z","article-title":"ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels","volume":"34","author":"Dempster","year":"2020","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_33","unstructured":"Compagnoni, E.M., Biggio, L., Orvieto, A., Hofmann, T., and Teichmann, J. (2022). Randomized signature layers for signal extraction in time series data. arXiv."},{"key":"ref_34","unstructured":"Braei, M., and Wagner, S. (2020). Anomaly detection in univariate time-series: A survey on the state-of-the-art. arXiv."},{"key":"ref_35","unstructured":"Shyu, M.L., Chen, S.C., Sarinnapakorn, K., and Chang, L. (2006). Principal component-based anomaly detection scheme. Foundations and Novel Approaches in Data Mining, Springer."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ringberg, H., Soule, A., Rexford, J., and Diot, C. (2007, January 12\u201316). Sensitivity of PCA for traffic anomaly detection. Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, San Diego, CA, USA.","DOI":"10.1145\/1254882.1254895"},{"key":"ref_37","unstructured":"Bin, X., Zhao, Y., and Shen, B. (2016). Abnormal Subspace Sparse PCA for Anomaly Detection and Interpretation. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1109\/TST.2016.7590319","article-title":"PCA-based network traffic anomaly detection","volume":"21","author":"Ding","year":"2016","journal-title":"Tsinghua Sci. Technol."},{"key":"ref_39","first-page":"226","article-title":"A density-based algorithm for discovering clusters in large spatial databases with noise","volume":"96","author":"Ester","year":"1996","journal-title":"Proc. Kdd"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3068335","article-title":"DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN","volume":"42","author":"Schubert","year":"2017","journal-title":"ACM Trans. Database Syst. (TODS)"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"621","DOI":"10.2165\/00002018-200730070-00010","article-title":"Principles of data mining","volume":"30","author":"Hand","year":"2007","journal-title":"Drug Saf."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1479","DOI":"10.1109\/TKDE.2019.2947676","article-title":"Extended isolation forest","volume":"33","author":"Hariri","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Breunig, M.M., Kriegel, H.P., Ng, R.T., and Sander, J. (2000, January 16\u201318). LOF: Identifying density-based local outliers. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA.","DOI":"10.1145\/342009.335388"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Alghushairy, O., Alsini, R., Soule, T., and Ma, X. (2020). A review of local outlier factor algorithms for outlier detection in big data streams. Big Data Cogn. Comput., 5.","DOI":"10.3390\/bdcc5010001"},{"key":"ref_46","unstructured":"Fuller, W.A. (2009). Introduction to Statistical Time Series, Wiley."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/10\/385\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:57:20Z","timestamp":1760144240000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/15\/10\/385"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,19]]},"references-count":46,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,10]]}},"alternative-id":["a15100385"],"URL":"https:\/\/doi.org\/10.3390\/a15100385","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,19]]}}}