<?xml version="1.0" encoding="UTF-8"?>
<crossref_result xmlns="http://www.crossref.org/qrschema/3.0" version="3.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.crossref.org/qrschema/3.0 http://www.crossref.org/schemas/crossref_query_output3.0.xsd">
  <query_result>
    <head>
      <doi_batch_id>none</doi_batch_id>
    </head>
    <body>
      <query status="resolved">
        <doi type="journal_article">10.1049/iet-cvi.2016.0326</doi>
        <crm-item name="publisher-name" type="string">Institution of Engineering and Technology (IET)</crm-item>
        <crm-item name="prefix-name" type="string">Institution of Engineering and Technology (IET)</crm-item>
        <crm-item name="member-id" type="number">265</crm-item>
        <crm-item name="citation-id" type="number">89889654</crm-item>
        <crm-item name="journal-id" type="number">64226</crm-item>
        <crm-item name="deposit-timestamp" type="number">2025102702292000359</crm-item>
        <crm-item name="owner-prefix" type="string">10.1049</crm-item>
        <crm-item name="last-update" type="date">2025-10-27T11:16:26Z</crm-item>
        <crm-item name="created" type="date">2017-04-28T22:14:27Z</crm-item>
        <crm-item name="citedby-count" type="number">15</crm-item>
        <doi_record>
          <crossref xmlns="http://www.crossref.org/xschema/1.1" xsi:schemaLocation="http://www.crossref.org/xschema/1.1 http://doi.crossref.org/schemas/unixref1.1.xsd">
            <journal>
              <journal_metadata language="en">
                <full_title>IET Computer Vision</full_title>
                <abbrev_title>IET Computer Vision</abbrev_title>
                <issn media_type="print">1751-9632</issn>
                <issn media_type="electronic">1751-9640</issn>
              </journal_metadata>
              <journal_issue>
                <publication_date media_type="print">
                  <month>10</month>
                  <year>2017</year>
                </publication_date>
                <journal_volume>
                  <volume>11</volume>
                </journal_volume>
                <issue>7</issue>
                <doi_data>
                  <doi>10.1049/cvi2.v11.7</doi>
                  <resource>https://ietresearch.onlinelibrary.wiley.com/toc/17519640/11/7</resource>
                </doi_data>
              </journal_issue>
              <journal_article publication_type="full_text">
                <titles>
                  <title>Human‐action recognition using a multi‐layered fusion scheme of Kinect modalities</title>
                </titles>
                <contributors>
                  <person_name contributor_role="author" sequence="first">
                    <given_name>Bassem</given_name>
                    <surname>Seddik</surname>
                    <affiliation>LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Sousse, Tunisia</affiliation>
                    <affiliation>National Engineering School of Sfax, University of Sfax, Sfax, Tunisia</affiliation>
                    <ORCID>http://orcid.org/0000-0003-0617-686X</ORCID>
                  </person_name>
                  <person_name contributor_role="author" sequence="additional">
                    <given_name>Sami</given_name>
                    <surname>Gazzah</surname>
                    <affiliation>LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Sousse, Tunisia</affiliation>
                  </person_name>
                  <person_name contributor_role="author" sequence="additional">
                    <given_name>Najoua</given_name>
                    <surname>Essoukri Ben Amara</surname>
                    <affiliation>LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Sousse, Tunisia</affiliation>
                  </person_name>
                </contributors>
                <jats:abstract xmlns:jats="http://www.ncbi.nlm.nih.gov/JATS1" abstract-type="main">
                  <jats:p>This study addresses the problem of efficiently combining the joint, RGB and depth modalities of the Kinect sensor in order to recognise human actions. For this purpose, a multi‐layered fusion scheme concatenates different specific features, builds specialised local and global SVM models and then iteratively fuses their scores. The authors contribute at two levels: (i) they combine the performance of local descriptors with the strength of global bags‐of‐visual‐words representations, which yields improved local decisions that allow noisy frames to be handled; (ii) they study the performance of multiple fusion schemes guided by feature concatenation, Fisher vector representation concatenation and subsequent iterative score fusion. To demonstrate the efficiency of their approach, they evaluate it on two challenging public datasets: CAD‐60 and CGC‐2014. Competitive results are obtained on both benchmarks.</jats:p>
                </jats:abstract>
                <publication_date media_type="online">
                  <month>08</month>
                  <day>18</day>
                  <year>2017</year>
                </publication_date>
                <publication_date media_type="print">
                  <month>10</month>
                  <year>2017</year>
                </publication_date>
                <pages>
                  <first_page>530</first_page>
                  <last_page>540</last_page>
                </pages>
                <publisher_item>
                  <identifier id_type="doi">10.1049/iet-cvi.2016.0326</identifier>
                </publisher_item>
                <archive_locations>
                  <archive name="Portico" />
                </archive_locations>
                <ai:program xmlns:ai="http://www.crossref.org/AccessIndicators.xsd" name="AccessIndicators">
                  <ai:license_ref applies_to="vor" start_date="2017-08-18">http://onlinelibrary.wiley.com/termsAndConditions#vor</ai:license_ref>
                </ai:program>
                <doi_data>
                  <doi>10.1049/iet-cvi.2016.0326</doi>
                  <resource>https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/iet-cvi.2016.0326</resource>
                  <collection property="crawler-based">
                    <item crawler="iParadigms">
                      <resource>https://ietresearch.onlinelibrary.wiley.com/doi/pdf/10.1049/iet-cvi.2016.0326</resource>
                    </item>
                  </collection>
                  <collection property="text-mining">
                    <item>
                      <resource mime_type="application/pdf">https://onlinelibrary.wiley.com/doi/pdf/10.1049/iet-cvi.2016.0326</resource>
                    </item>
                    <item>
                      <resource mime_type="application/xml">https://onlinelibrary.wiley.com/doi/full-xml/10.1049/iet-cvi.2016.0326</resource>
                    </item>
                  </collection>
                  <collection property="list-based" multi-resolution="lock" />
                </doi_data>
                <citation_list>
                  <citation key="e_1_2_7_2_2">
                    <doi>10.1049/iet-cvi.2015.0321</doi>
                  </citation>
                  <citation key="e_1_2_7_3_2">
                    <doi>10.1016/j.patrec.2014.04.011</doi>
                  </citation>
                  <citation key="e_1_2_7_4_2">
                    <doi>10.1049/iet-cvi.2013.0323</doi>
                  </citation>
                  <citation key="e_1_2_7_5_2">
                    <doi>10.1049/iet-cvi.2015.0291</doi>
                  </citation>
                  <citation key="e_1_2_7_6_2">
                    <doi>10.3389/frobt.2015.00028</doi>
                  </citation>
                  <citation key="e_1_2_7_7_2">
                    <doi provider="crossref">10.1007/978-3-319-46448-0_10</doi>
                    <unstructured_citation>Haque A. Peng B. Luo Z. et al.: ‘Towards viewpoint invariant 3d human pose estimation’. Proc. ECCV 2016 pp. 160–177</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_8_2">
                    <doi provider="crossref">10.1007/978-3-319-10602-1_37</doi>
                    <unstructured_citation>Wang L. Qiao Y. Tang X.: ‘Video action detection with relational dynamic‐poselets’. Proc. ECCV 2014 pp. 565–580</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_9_2">
                    <doi>10.1007/s11263-016-0917-2</doi>
                  </citation>
                  <citation key="e_1_2_7_10_2">
                    <doi provider="crossref">10.1109/CVPR.2008.4587756</doi>
                    <unstructured_citation>Laptev I. Marszalek M. Schmid C. et al.: ‘Learning realistic human actions from movies’. Proc. CVPR 2008 pp. 1–8</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_11_2">
                    <doi provider="crossref">10.1109/ICCV.2013.396</doi>
                    <unstructured_citation>Jhuang H. Gall J. Zuffi S. et al.: ‘Towards understanding action recognition’. Proc. ICCV 2013 pp. 3192–3199</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_12_2">
                    <doi provider="crossref">10.1109/ICRA.2012.6224591</doi>
                    <unstructured_citation>Sung J. Ponce C. Selman B. et al.: ‘Unstructured human activity detection from rgbd images’. Proc. ICRA 2012 pp. 842–849</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_13_2">
                    <doi>10.1007/s00138-014-0596-3</doi>
                  </citation>
                  <citation key="e_1_2_7_14_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_32</doi>
                    <unstructured_citation>Escalera S. Baró X. Gonzàlez J. et al.: ‘Chalearn looking at people challenge 2014: dataset and results’. Proc. ECCV Workshops 2014 pp. 459–473</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_15_2">
                    <doi>10.1016/j.neucom.2015.09.116</doi>
                  </citation>
                  <citation key="e_1_2_7_16_2">
                    <unstructured_citation>Krizhevsky A. Sutskever I. Hinton G.E.: ‘ImageNet classification with deep convolutional neural networks’. Proc. NIPS 2012 pp. 1097–1105</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_17_2">
                    <doi provider="crossref">10.1007/978-3-642-15561-1_11</doi>
                    <unstructured_citation>Perronnin F. Sánchez J. Mensink T.: ‘Improving the Fisher kernel for large‐scale image classification’. Proc. ECCV 2010 pp. 143–156</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_18_2">
                    <doi provider="crossref">10.1109/ICCV.2015.222</doi>
                    <unstructured_citation>Pfister T. Charles J. Zisserman A.: ‘Flowing convNets for human pose estimation in videos’. Proc. ICCV 2015 pp. 1913–1921</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_19_2">
                    <doi>10.1109/TPAMI.2015.2461544</doi>
                  </citation>
                  <citation key="e_1_2_7_20_2">
                    <doi provider="crossref">10.1109/CVPR.2015.7299059</doi>
                    <unstructured_citation>Wang L. Qiao Y. Tang X.: ‘Action recognition with trajectory‐pooled deep‐convolutional descriptors’. Proc. CVPR 2015 pp. 4305–4314</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_21_2">
                    <doi provider="crossref">10.1109/EUSIPCO.2015.7362562</doi>
                    <unstructured_citation>Seddik B. Gazzah S. Essoukri Ben Amara N.: ‘Hands face and joints for multi‐modal human‐action temporal segmentation and recognition’. Proc. EUSIPCO 2015 pp. 1143–1147</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_22_2">
                    <doi provider="crossref">10.1007/978-3-319-23234-8_65</doi>
                    <unstructured_citation>Seddik B. Gazzah S. Essoukri Ben Amara N.: ‘Modalities combination for Italian sign language extraction and recognition’. Proc. ICIAP 2015 pp. 710–721</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_23_2">
                    <journal_title>J. Mach. Learn. Res.</journal_title>
                    <author>Wan J.</author>
                    <first_page>2549</first_page>
                    <volume>14</volume>
                    <cYear>2013</cYear>
                    <article_title>One‐shot learning gesture recognition from rgb‐d data using bag of features</article_title>
                  </citation>
                  <citation key="e_1_2_7_24_2">
                    <doi>10.1177/0278364913478446</doi>
                  </citation>
                  <citation key="e_1_2_7_25_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_41</doi>
                    <unstructured_citation>Camgöz N.C. Kindiroglu A.A. Akarun L.: ‘Gesture recognition using template based random forest classifiers’. Proc. ECCV Workshops 2014 pp. 579–594</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_26_2">
                    <doi>10.1016/j.cviu.2016.04.005</doi>
                  </citation>
                  <citation key="e_1_2_7_27_2">
                    <doi>10.1109/THMS.2014.2377111</doi>
                  </citation>
                  <citation key="e_1_2_7_28_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_34</doi>
                    <unstructured_citation>Monnier C. German S. Ost A.: ‘A multi‐scale boosted detector for efficient and robust gesture recognition’. Proc. ECCV Workshops 2014 pp. 491–502</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_29_2">
                    <doi provider="crossref">10.1109/ARSO.2014.7020983</doi>
                    <unstructured_citation>Shan J. Akella S.: ‘3d human action segmentation and recognition using pose kinetic energy’. Proc. ARSO 2014 pp. 69–75</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_30_2">
                    <doi provider="crossref">10.1109/ICCV.2013.342</doi>
                    <unstructured_citation>Zanfir M. Leordeanu M. Sminchisescu C.: ‘The moving pose: an efficient 3d kinematics descriptor for low‐latency action recognition and detection’. Proc. ICCV 2013 pp. 2752–2759</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_31_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_35</doi>
                    <unstructured_citation>Chang J.Y.: ‘Nonparametric gesture labeling from multi‐modal data’. Proc. ECCV Workshops 2014 pp. 503–517</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_32_2">
                    <doi provider="crossref">10.1109/ROMAN.2014.6926340</doi>
                    <unstructured_citation>Faria D.R. Premebida C. Nunes U.: ‘A probabilistic approach for human everyday activities recognition using body motion from rgb‐d images’. Proc. RO‐MAN 2014 pp. 732–737</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_33_2">
                    <doi>10.1155/2016/4351435</doi>
                  </citation>
                  <citation key="e_1_2_7_34_2">
                    <doi>10.1109/TPAMI.2015.2439257</doi>
                  </citation>
                  <citation key="e_1_2_7_35_2">
                    <doi>10.1049/iet-cvi.2015.0233</doi>
                  </citation>
                  <citation key="e_1_2_7_36_2">
                    <doi>10.1016/j.patrec.2013.09.009</doi>
                  </citation>
                  <citation key="e_1_2_7_37_2">
                    <doi provider="crossref">10.1109/CVPR.2011.5995407</doi>
                    <unstructured_citation>Wang H. Kläser A. Schmid C. et al.: ‘Action recognition by dense trajectories’. Proc. CVPR 2011 pp. 3169–3176</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_38_2">
                    <doi provider="crossref">10.1109/ICCV.2013.441</doi>
                    <unstructured_citation>Wang H. Schmid C.: ‘Action recognition with improved trajectories’. Proc. ICCV 2013 pp. 3551–3558</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_39_2">
                    <doi>10.1016/j.patrec.2013.10.010</doi>
                  </citation>
                  <citation key="e_1_2_7_40_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_44</doi>
                    <unstructured_citation>Liang B. Zheng L.: ‘Multi‐modal gesture recognition using skeletal joints and motion trail model’. Proc. ECCV Workshops 2014 pp. 623–638</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_41_2">
                    <doi provider="crossref">10.1109/CVPR.2013.98</doi>
                    <unstructured_citation>Oreifej O. Liu Z.: ‘Hon4d: histogram of oriented 4d normals for activity recognition from depth sequences’. Proc. CVPR 2013 pp. 716–723</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_42_2">
                    <doi>10.1016/j.cviu.2015.05.010</doi>
                  </citation>
                  <citation key="e_1_2_7_43_2">
                    <doi>10.1016/j.imavis.2014.04.005</doi>
                  </citation>
                  <citation key="e_1_2_7_44_2">
                    <doi>10.3389/fnbot.2015.00003</doi>
                  </citation>
                  <citation key="e_1_2_7_45_2">
                    <doi>10.1049/iet-cvi.2013.0306</doi>
                  </citation>
                  <citation key="e_1_2_7_46_2">
                    <doi>10.1016/j.cviu.2016.03.013</doi>
                  </citation>
                  <citation key="e_1_2_7_47_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_36</doi>
                    <unstructured_citation>Peng X. Wang L. Cai Z. et al.: ‘Action and gesture temporal spotting with super vector representation’. Proc. ECCV Workshops 2014 pp. 518–527</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_48_2">
                    <doi>10.1049/iet-cvi.2013.0015</doi>
                  </citation>
                  <citation key="e_1_2_7_49_2">
                    <doi>10.1016/j.patrec.2014.07.011</doi>
                  </citation>
                  <citation key="e_1_2_7_50_2">
                    <journal_title>Int. J. Comput. Vis.</journal_title>
                    <author>Pigou L.</author>
                    <first_page>1</first_page>
                    <volume>124</volume>
                    <cYear>2016</cYear>
                    <article_title>Beyond temporal pooling: recurrence and temporal convolutions for gesture recognition in video</article_title>
                  </citation>
                  <citation key="e_1_2_7_51_2">
                    <doi>10.1109/TPAMI.2016.2537340</doi>
                  </citation>
                  <citation key="e_1_2_7_52_2">
                    <doi>10.1049/iet-cvi.2015.0235</doi>
                  </citation>
                  <citation key="e_1_2_7_53_2">
                    <doi provider="crossref">10.1007/978-3-642-33709-3_13</doi>
                    <unstructured_citation>Ni B. Moulin P. Yan S.: ‘Order‐Preserving sparse coding for sequence classification’. Proc. ECCV 2012 pp. 173–187</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_54_2">
                    <doi>10.1016/j.jvcir.2013.03.001</doi>
                  </citation>
                  <citation key="e_1_2_7_55_2">
                    <doi provider="crossref">10.1109/CVPR.2016.456</doi>
                    <unstructured_citation>Molchanov P. Yang X. Gupta S. et al.: ‘Online detection and classification of dynamic hand gestures with recurrent 3d convolutional neural networks’. Proc. CVPR 2016 pp. 4207–4215</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_56_2">
                    <doi provider="crossref">10.1007/978-3-319-16178-5_42</doi>
                    <unstructured_citation>Evangelidis G.D. Singh G. Horaud R.: ‘Continuous gesture recognition from articulated poses’. Proc. ECCV Workshops 2014 pp. 595–607</unstructured_citation>
                  </citation>
                  <citation key="e_1_2_7_57_2">
                    <doi provider="crossref">10.1109/SSD.2013.6564032</doi>
                    <unstructured_citation>Seddik B. Maâmatou H. Gazzah S. et al.: ‘Unsupervised facial expressions recognition and avatar reconstruction from kinect’. Proc. SSD 2013 pp. 1–6</unstructured_citation>
                  </citation>
                </citation_list>
              </journal_article>
            </journal>
          </crossref>
        </doi_record>
      </query>
    </body>
  </query_result>
</crossref_result>