{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"institution":[{"id":[{"id":"https:\/\/ror.org\/03mb6wj31","id-type":"ROR","asserted-by":"publisher"},{"id":"https:\/\/www.isni.org\/000000041937028X","id-type":"ISNI","asserted-by":"publisher"},{"id":"https:\/\/www.wikidata.org\/entity\/Q1640731","id-type":"wikidata","asserted-by":"publisher"}],"name":"Universitat Polit\u00e8cnica de Catalunya","acronym":["UPC"]}],"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T18:10:25Z","timestamp":1769710225886,"version":"3.49.0"},"reference-count":0,"publisher":"Universitat Polit\u00e8cnica de Catalunya","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>Human body analysis is one of the broadest areas within the computer vision field. Researchers have invested considerable effort in human body analysis, especially over the last decade, owing to technological improvements in both video cameras and processing power. Human body analysis covers topics such as person detection and segmentation, human motion tracking, and action and behavior recognition. Although human beings perform all these tasks naturally, they pose a challenging problem from a computer vision point of view. Adverse conditions such as viewing perspective, clutter and occlusions, lighting conditions, or variability of behavior among persons may turn human body analysis into an arduous task.\r\nIn the computer vision field, the evolution of research is usually tightly linked to the technological progress of camera sensors and computer processing power. Traditional human body analysis methods are based on color cameras. Thus, the information is extracted from the raw color data, which strongly limits the proposals. A significant leap in quality was achieved by introducing the multiview concept, that is, having multiple color cameras record a single scene at the same time. 
With multiview approaches, 3D information is available by means of stereo matching algorithms. Having 3D information is a key aspect of human motion analysis, since the human body moves in a three-dimensional space. Thus, problems such as occlusion and clutter may be overcome with 3D information. \r\nThe appearance of commercial depth cameras has marked a second leap in the human body analysis field. While traditional multiview approaches required a cumbersome and expensive setup, as well as fine camera calibration, novel depth cameras directly provide 3D information with a single camera sensor. Furthermore, depth cameras may be rapidly installed in a wide range of situations, enlarging the range of applications with respect to multiview approaches. Moreover, since depth cameras are based on infrared light, they do not suffer from illumination variations. \r\nIn this thesis, we focus on the study of depth data applied to the human body analysis problem. We propose novel ways of describing depth data through specific descriptors that emphasize characteristics of the scene that are helpful for further body analysis. These descriptors exploit the special 3D structure of depth data to outperform generalist 3D descriptors or color-based ones. We also study the problem of person detection, proposing a highly robust and fast method to detect heads. This method is extended to a hand tracker, which is used throughout the thesis as a tool to enable further research. In the remainder of this dissertation, we focus on the hand analysis problem as a subarea of human body analysis. Given the recent appearance of depth cameras, there is a lack of public datasets. We contribute a dataset for hand gesture recognition and fingertip localization using depth data. This dataset acts as the starting point for two proposals for hand gesture recognition and fingertip localization based on classification techniques. 
In these methods, we also exploit the above-mentioned descriptor proposals to adapt closely to the nature of depth data.<\/jats:p>\n                <jats:p>L\u2019an\u00e0lisi del cos hum\u00e0 \u00e9s una de les \u00e0rees m\u00e9s \u00e0mplies del camp de la visi\u00f3 per computador. Els investigadors han posat un gran esfor\u00e7 en el camp de l\u2019an\u00e0lisi del cos hum\u00e0, sobretot durant la darrera d\u00e8cada, degut als grans aven\u00e7os tecnol\u00f2gics, tant pel que fa a les c\u00e0meres com a la pot\u00e8ncia de c\u00e0lcul. L\u2019an\u00e0lisi del cos hum\u00e0 engloba diversos temes com la detecci\u00f3 i segmentaci\u00f3 de persones, el seguiment del moviment del cos, o el reconeixement d'accions. Tot i que els \u00e9ssers humans duen a terme aquestes tasques d'una manera natural, es converteixen en un dif\u00edcil problema quan s'ataca des de l\u2019\u00f2ptica de la visi\u00f3 per computador. Situacions adverses, com poden ser la perspectiva del punt de vista, les oclusions, les condicions d\u2019il\u00b7luminaci\u00f3 o la variabilitat de comportament entre persones, converteixen l\u2019an\u00e0lisi del cos hum\u00e0 en una tasca complicada.\r\nEn el camp de la visi\u00f3 per computador, l\u2019evoluci\u00f3 de la recerca va sovint lligada al progr\u00e9s tecnol\u00f2gic, tant dels sensors com de la pot\u00e8ncia de c\u00e0lcul dels ordinadors. Els m\u00e8todes tradicionals d\u2019an\u00e0lisi del cos hum\u00e0 estan basats en c\u00e0meres de color. Aix\u00f2 limita molt els enfocaments, ja que la informaci\u00f3 disponible prov\u00e9 \u00fanicament de les dades de color.\r\nEl concepte multivista va suposar un salt de qualitat important. En els enfocaments multivista es tenen m\u00faltiples c\u00e0meres gravant una mateixa escena simult\u00e0niament, permetent utilitzar informaci\u00f3 3D gr\u00e0cies a algorismes de combinaci\u00f3 est\u00e8reo. 
El fet de disposar d\u2019informaci\u00f3 3D \u00e9s un punt clau, ja que el cos hum\u00e0 es mou en un espai tridimensional.\r\nAix\u00ed doncs, problemes com les oclusions es poden apaivagar si es disposa d\u2019informaci\u00f3 3D. L\u2019aparici\u00f3 de les c\u00e0meres de profunditat comercials ha suposat un segon salt en el camp de l\u2019an\u00e0lisi del cos hum\u00e0. Mentre els m\u00e8todes multivista tradicionals requereixen un muntatge pesat i car, i una calibraci\u00f3 precisa de totes les c\u00e0meres, les noves c\u00e0meres de profunditat ofereixen informaci\u00f3 3D de forma directa amb un sol sensor. Aquestes c\u00e0meres es poden instal\u00b7lar r\u00e0pidament en una gran varietat d'entorns, ampliant enormement l'espectre d'aplicacions, que era molt redu\u00eft amb enfocaments multivista. A m\u00e9s a m\u00e9s, com que les c\u00e0meres de profunditat estan basades en llum infraroja, no pateixen problemes relacionats amb canvis d\u2019il\u00b7luminaci\u00f3.\r\n\r\nEn aquesta tesi, ens centrem en l'estudi de la informaci\u00f3 que ofereixen les c\u00e0meres de profunditat, i la seva aplicaci\u00f3 al problema d\u2019an\u00e0lisi del cos hum\u00e0. Proposem noves vies per descriure les dades de profunditat mitjan\u00e7ant descriptors espec\u00edfics, capa\u00e7os d'emfatitzar caracter\u00edstiques de l'escena que seran \u00fatils de cara a una posterior an\u00e0lisi del cos hum\u00e0. Aquests descriptors exploten l'estructura 3D de les dades de profunditat per superar descriptors 3D generalistes o basats en color. Tamb\u00e9 estudiem el problema de detecci\u00f3 de persones, proposant un m\u00e8tode per detectar caps robust i r\u00e0pid. \r\nAmpliem aquest m\u00e8tode per obtenir un algorisme de seguiment de mans que ha estat utilitzat al llarg de la tesi. En la part final del document, ens centrem en l\u2019an\u00e0lisi de les mans com a sub\u00e0rea de l\u2019an\u00e0lisi del cos hum\u00e0. 
Degut a la recent aparici\u00f3 de les c\u00e0meres de profunditat, hi ha una manca de bases de dades p\u00fabliques. \r\nContribu\u00efm amb una base de dades pensada per la localitzaci\u00f3 de dits i el reconeixement de gestos utilitzant dades de profunditat. Aquesta base de dades \u00e9s el punt de partida de dues contribucions sobre localitzaci\u00f3 de dits i reconeixement de gestos basades en t\u00e8cniques de classificaci\u00f3. En aquests m\u00e8todes, tamb\u00e9 explotem les ja mencionades propostes de descriptors per millor adaptar-nos a la naturalesa de les dades de profunditat.<\/jats:p>","DOI":"10.5821\/dissertation-2117-95252","type":"dissertation","created":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T03:15:42Z","timestamp":1696994142000},"approved":{"date-parts":[[2013,12,4]]},"source":"Crossref","is-referenced-by-count":0,"title":["Human body analysis using depth data"],"prefix":"10.5821","author":[{"sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xavier","family":"Suau Cuadros","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"3865","container-title":[],"original-title":[],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T06:47:52Z","timestamp":1769669272000},"score":1,"resource":{"primary":{"URL":"https:\/\/hdl.handle.net\/2117\/95252"}},"subtitle":[],"editor":[{"given":"Javier","family":"Ruiz Hidalgo","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]},{"given":"Josep Ramon","family":"Casas Pla","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[null]]},"references-count":0,"URL":"https:\/\/doi.org\/10.5821\/dissertation-2117-95252","relation":{},"subject":[]}}