{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"institution":[{"id":[{"id":"https:\/\/ror.org\/03mb6wj31","id-type":"ROR","asserted-by":"publisher"},{"id":"https:\/\/www.isni.org\/000000041937028X","id-type":"ISNI","asserted-by":"publisher"},{"id":"https:\/\/www.wikidata.org\/entity\/Q1640731","id-type":"wikidata","asserted-by":"publisher"}],"name":"Universitat Polit\u00e8cnica de Catalunya","acronym":["UPC"]}],"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T11:26:52Z","timestamp":1773919612613,"version":"3.50.1"},"reference-count":0,"publisher":"Universitat Polit\u00e8cnica de Catalunya","license":[{"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>(English) Nowadays, authorities monitor the concentrations of regulated air pollutants in order to assist in decision-making processes, e.g., for the implementation of traffic restrictions, and mitigate the effects of air pollution. For this purpose, they deploy high-precision instrumentation, the cost of which makes the number of sensors deployed over a region very low. The advent of air pollution low-cost sensors (LCSs) has opened up the possibility of complementing the authorities' instruments with more measurement points.\r\nUnfortunately, LCSs present inaccuracies, which makes it difficult to include them in a regulated way for decision-making processes of authorities.\r\nIn recent years, enabling technologies such as the internet of things (IoT) and machine learning (ML) have allowed the improvement of the data quality of LCSs. 
Therefore, this thesis is devoted to the improvement of the data quality of air pollution monitoring LCS networks, focusing on two aspects: i) the improvement of data quality at node level using ML-based sensor calibration, and ii) the improvement of the sensor network data quality by using measurements from the network sensors with a graph-based approach.\r\nIn the first part of the thesis, the improvement of the data quality of individual sensors is investigated. First, it is evaluated how the sensor sampling affects the representativeness of the samples. Then, the use of ML techniques, both linear and nonlinear, for the in-situ calibration of LCSs is analyzed. The in-situ sensor calibration task can be seen as a supervised ML problem, so techniques such as multiple linear regression (MLR) or support vector regression (SVR) are evaluated. The evaluation shows how nonlinear techniques improve the quality of pollution estimates significantly. In addition, given the inaccuracies present in LCSs and the difference that exists from one sensor to another of the same manufacturer, the inclusion in the calibration of multiple sensors measuring the same pollutant is investigated. Thereby, the proposed multisensor calibration approach based on ML results in increased calibration accuracy.\r\nThe second part of the thesis focuses on the quality of the data reported by a sensor network once deployed over an area. A graph-based approach is proposed to describe the existing relationships between sensors using a graph topology and represent the network measurements as signals defined on the graph, as is done in the graph signal processing (GSP) field. First, different techniques have been evaluated to correctly learn the relationships between sensors in a network that can contain both LCSs and high-precision nodes. The most suitable option has proven to be the data-driven GSP model based on signal smoothness. 
Then, different signal reconstruction techniques coupled with the graph have been studied in order to reconstruct pollution measurements reported by different sensors in a network. Kernel-based techniques and those based on the weights of the Laplacian have been the most effective ones. Once these main components have been studied, a graph-based data reconstruction framework has been proposed for different post-processing applications that appear in LCS networks, e.g., missing value imputation and virtual sensing.\r\nThe results have shown how this framework handles, with precision, a wide variety of applications and scenarios that can occur in this context. Finally, another important aspect of this type of network has been addressed, which is the detection of outliers. The Volterra graph-based outlier detection (VGOD) algorithm has been proposed, using a graph learned from the data and a signal reconstruction model based on the Volterra series, to detect and locate outliers. The proposed algorithm has been shown to improve the monitoring and maintenance of heterogeneous air pollution sensor networks by identifying abnormal measurements and malfunctioning sensors.<\/jats:p>\n                <jats:p>(Espa\u00f1ol) Hoy en d\u00eda, las autoridades vigilan las concentraciones de contaminantes atmosf\u00e9ricos regulados para ayudar en los procesos de toma de decisiones, por ejemplo, en la aplicaci\u00f3n de restricciones de tr\u00e1fico, y mitigar los efectos de la contaminaci\u00f3n atmosf\u00e9rica.\r\nPara ello, despliegan instrumentaci\u00f3n de alta precisi\u00f3n, cuyo coste hace que el n\u00famero de sensores desplegados en una regi\u00f3n sea muy reducido. La aparici\u00f3n de sensores de contaminaci\u00f3n atmosf\u00e9rica de bajo coste (LCS) ha abierto la posibilidad de complementar los instrumentos de las autoridades con m\u00e1s puntos de medici\u00f3n. 
Desafortunadamente, los LCS presentan imprecisiones, dificultando su inclusi\u00f3n de forma regulada en los procesos de toma de decisiones.\r\nEn los \u00faltimos a\u00f1os, tecnolog\u00edas como el internet de las cosas (IoT) y el aprendizaje autom\u00e1tico (ML) han permitido mejorar la calidad de los datos de los LCSs. Por lo tanto, esta tesis est\u00e1 dedicada a la mejora de la calidad de los datos de las redes de LCS de contaminaci\u00f3n atmosf\u00e9rica, centr\u00e1ndose en dos aspectos: i) la mejora de la calidad de los datos a nivel de nodo utilizando calibraci\u00f3n de sensores basada en ML, y ii) la mejora de la calidad de los datos de la red de sensores utilizando mediciones de los sensores de la propia red mediante un enfoque basado en grafos.\r\nEn la primera parte de la tesis se investiga la mejora de la calidad de los datos de los sensores de forma individual. Primero, se eval\u00faa c\u00f3mo afecta el muestreo de los sensores a la representatividad de las muestras. A continuaci\u00f3n, se analiza el uso de t\u00e9cnicas ML, tanto lineales como no lineales, para la calibraci\u00f3n in-situ de LCSs. La tarea de calibraci\u00f3n de sensores in-situ puede considerarse un problema de aprendizaje de ML supervisado, por ello se eval\u00faan t\u00e9cnicas como la multiple linear regression (MLR) o support vector regression (SVR). La evaluaci\u00f3n muestra c\u00f3mo las t\u00e9cnicas no lineales mejoran significativamente la calidad de las estimaciones de contaminaci\u00f3n. Adem\u00e1s, dadas las imprecisiones presentes en los LCS y la diferencia que existe de un sensor a otro del mismo fabricante, se investiga la inclusi\u00f3n en la calibraci\u00f3n de m\u00faltiples sensores que miden el mismo contaminante. 
As\u00ed, el enfoque propuesto de calibraci\u00f3n multisensor basado en ML permite aumentar la precisi\u00f3n de la calibraci\u00f3n.\r\nLa segunda parte de la tesis se centra en la calidad de los datos medidos por la red de sensores una vez desplegada. Se propone un enfoque basado en grafos para describir las relaciones existentes entre los sensores mediante la topolog\u00eda del grafo y representar las medidas de la red como se\u00f1ales definidas en el grafo, como en el campo del graph signal processing (GSP). Se han evaluado diferentes t\u00e9cnicas para aprender correctamente las relaciones entre sensores de una red que puede contener tanto LCSs como nodos de alta precisi\u00f3n. El modelo de GSP basado en el smoothness de la se\u00f1al ha resultado ser el mejor. A continuaci\u00f3n, se han estudiado distintas t\u00e9cnicas de reconstrucci\u00f3n de se\u00f1al acopladas al grafo con el fin de reconstruir las medidas de contaminaci\u00f3n obtenidas por los distintos sensores de la red. Las t\u00e9cnicas basadas en kernel y las basadas en los pesos del Laplaciano han sido las m\u00e1s efectivas. Luego, se ha propuesto un framework de reconstrucci\u00f3n de datos basado en grafos para diferentes aplicaciones de post-procesado que aparecen en las redes de LCSs, por ejemplo, la imputaci\u00f3n de valores perdidos y los sensores virtuales. Los resultados han mostrado c\u00f3mo este framework permite abordar con precisi\u00f3n una amplia variedad de aplicaciones y escenarios que pueden darse en este contexto. Por \u00faltimo, se ha investigado otro aspecto importante de este tipo de redes, la detecci\u00f3n de valores at\u00edpicos. 
Se ha propuesto el algoritmo Volterra graph-based outlier detection (VGOD), que utiliza un grafo aprendido a partir de los datos y un modelo de reconstrucci\u00f3n de se\u00f1al basado en las series de Volterra, para detectar y localizar medidas an\u00f3malas.<\/jats:p>","DOI":"10.5821\/dissertation-2117-399269","type":"dissertation","created":{"date-parts":[[2024,1,13]],"date-time":"2024-01-13T01:22:06Z","timestamp":1705108926000},"approved":{"date-parts":[[2023,5,8]]},"source":"Crossref","is-referenced-by-count":0,"title":["On the data quality improvement of air pollution monitoring low-cost sensor networks using data-driven techniques"],"prefix":"10.5821","author":[{"sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pau","family":"Ferrer Cid","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"3865","container-title":[],"original-title":[],"deposited":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T06:33:40Z","timestamp":1773902020000},"score":1,"resource":{"primary":{"URL":"https:\/\/hdl.handle.net\/2117\/399269"}},"subtitle":[],"editor":[{"given":"Jorge","family":"Garc\u00eda Vidal","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]},{"given":"Jos\u00e9 Mar\u00eda","family":"Barcel\u00f3 Ordinas","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[null]]},"references-count":0,"URL":"https:\/\/doi.org\/10.5821\/dissertation-2117-399269","relation":{},"subject":[]}}