{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,6,9]],"date-time":"2024-06-09T20:40:02Z","timestamp":1717965602931},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"15","license":[{"start":{"date-parts":[[2015,7,1]],"date-time":"2015-07-01T00:00:00Z","timestamp":1435708800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2016,8]]},"DOI":"10.1007\/s11042-015-2723-1","type":"journal-article","created":{"date-parts":[[2015,6,30]],"date-time":"2015-06-30T08:07:28Z","timestamp":1435651648000},"page":"8999-9023","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Naming multi-modal clusters to identify persons in TV broadcast"],"prefix":"10.1007","volume":"75","author":[{"given":"Johann","family":"Poignant","sequence":"first","affiliation":[]},{"given":"Guillaume","family":"Fortier","sequence":"additional","affiliation":[]},{"given":"Laurent","family":"Besacier","sequence":"additional","affiliation":[]},{"given":"Georges","family":"Qu\u00e9not","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2015,7,1]]},"reference":[{"issue":"5","key":"2723_CR1","doi-asserted-by":"crossref","first-page":"1505","DOI":"10.1109\/TASL.2006.878261","volume":"14","author":"C Barras","year":"2006","unstructured":"Barras C, Zhu X, Meignier S, Gauvain J-L (2006) Multi-stage speaker diarization of broadcast news. 
IEEE Trans Audio, Speech Language Processing 14(5):1505\u20131512","journal-title":"IEEE Trans Audio, Speech Language Processing"},{"key":"2723_CR2","doi-asserted-by":"crossref","unstructured":"Bendris M, Favre B, Charlet D, Damnati G, Auguste R, Martinet J, Senay G (2013) Unsupervised face identification in TV content using audio-visual sources. In: Proceedings of the 11th international workshop on content-based multimedia indexing (CBMI), pp 243\u2013249","DOI":"10.1109\/CBMI.2013.6576591"},{"key":"2723_CR3","doi-asserted-by":"crossref","unstructured":"B\u00e9chet F, Bendris M, Charlet D, Damnati G, Favre B, Rouvier M, Auguste R, Bigot B, Dufour R, Fredouille C, Linares G, Martinet J, Senay G, Tirilly P (2014) Multimodal understanding for person recognition in video broadcasts. In: 15th annual conference of the international speech communication association (INTERSPEECH)","DOI":"10.21437\/Interspeech.2014-146"},{"key":"2723_CR4","doi-asserted-by":"crossref","unstructured":"Bredin H, Poignant J (2013) Integer Linear Programming for Speaker Diarization and Cross-Modal Identification in TV Broadcast. In: the 14th annual conference of the international speech communication association, (INTERSPEECH)","DOI":"10.21437\/Interspeech.2013-381"},{"key":"2723_CR5","doi-asserted-by":"crossref","unstructured":"Bredin H, Poignant J, Tapaswi M, Fortier G, Le VB, Napoleon T, Gao H, Barras C, Rosset S, Besacier L, Verbeek J, Qu\u00e9not G, Jurie F, Kemal Ekenel H (2012) Fusion of speech, faces and text for person identification in TV broadcast. In: Workshop on information fusion in computer vision for concept recognition, ECCV-IFCVCR, pp 385\u2013394","DOI":"10.1007\/978-3-642-33885-4_39"},{"key":"2723_CR6","unstructured":"Bredin H, Poignant J, Fortier G, Tapaswi M, Le VB, Sarkar A, Barras C, Rosset S, Roy A, Yang Q, Gao H, Mignon A, Verbeek J, Besacier L, Qu\u00e9not G, Kemal Ekenel H, Stiefelhagen R (2013) QCompere at REPERE 2013. 
In: First workshop on speech, language and audio in multimedia - the 14th annual conference of the international speech communication association, INTERSPEECH-SLAM"},{"key":"2723_CR7","doi-asserted-by":"crossref","unstructured":"Bredin H, Roy A, Le VB, Barras C (2014) Person instance graphs for mono-, cross- and multi-modal person recognition in multimedia data: application to speaker identification in TV broadcast. In: International journal of multimedia information retrieval","DOI":"10.1007\/s13735-014-0055-y"},{"key":"2723_CR8","doi-asserted-by":"crossref","unstructured":"B\u00e4uml M, Bernardin K, Fischer M, Ekenel HK, Stiefelhagen R (2010) Multi-pose face recognition for person retrieval in camera networks. In: 7th International conference on advanced video and signal-based surveillance, AVSS, pp 441\u2013447","DOI":"10.1109\/AVSS.2010.42"},{"key":"2723_CR9","unstructured":"Canseco-Rodriguez L, Lamel L, Gauvain J-L (2004) Speaker diarization from speech transcripts. In: the 5th annual conference of the international speech communication association, INTERSPEECH"},{"key":"2723_CR10","doi-asserted-by":"crossref","unstructured":"Canseco L, Lamel L, Gauvain J-L (2005) A comparative study using manual and automatic transcriptions for diarization. In: IEEE workshop on automatic speech recognition and understanding, pp 415\u2013419","DOI":"10.1109\/ASRU.2005.1566507"},{"key":"2723_CR11","unstructured":"Chen SS, Gopalakrishnan PS (1998) Speaker, environment and channel change detection and clustering via the Bayesian information criterion. In: DARPA broadcast news transcription and understanding workshop, pp 127\u2013132"},{"key":"2723_CR12","doi-asserted-by":"crossref","unstructured":"Est\u00e8ve Y, Meignier S, Del\u00e9glise P, Mauclair J (2007) Extracting true speaker identities from transcriptions. 
In: the 8th annual conference of the international speech communication association, INTERSPEECH, pp 2601\u20132604","DOI":"10.21437\/Interspeech.2007-586"},{"key":"2723_CR13","unstructured":"Favre B, Damnati G, B\u00e9chet F, Bendris M, Charlet D, Auguste R, Ayache S, Bigot B, Delteil A, Dufour R, Fredouille C, Linares G, Martinet J, Senay G, Tirilly P (2013) PERCOLI: a person identification system for the 2013 REPERE challenge. In: First workshop on speech, language and audio in multimedia - the 14th annual conference of the international speech communication association, INTERSPEECH"},{"key":"2723_CR14","doi-asserted-by":"crossref","unstructured":"Gay P, Dupuy G, Lailler C, Odobez J-M, Meignier S, Del\u00e9glise P (2014) Comparison of two methods for unsupervised person identification in TV shows. In: 12th international workshop on content-based multimedia indexing (CBMI)","DOI":"10.1109\/CBMI.2014.6849828"},{"key":"2723_CR15","unstructured":"Giraudel A, Carr\u00e9 M, Mapelli V, Kahn J, Galibert O, Quintard L (2012) The REPERE corpus: a multimodal corpus for person recognition. In: the 8th international conference on language resources and evaluation, LREC"},{"key":"2723_CR16","doi-asserted-by":"crossref","unstructured":"Guillaumin M, Verbeek J, Schmid C (2009) Is that you? Metric learning approaches for face identification. In: the IEEE 12th international conference on computer vision, pp 498\u2013505","DOI":"10.1109\/ICCV.2009.5459197"},{"key":"2723_CR17","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1109\/5254.796089","volume":"14","author":"R Houghton","year":"1999","unstructured":"Houghton R (1999) Named faces: putting names to faces. 
IEEE Intell Syst 14:45\u201350","journal-title":"IEEE Intell Syst"},{"key":"2723_CR18","doi-asserted-by":"crossref","unstructured":"Jousse V, Petit-Renaud S, Meignier S, Est\u00e8ve Y, Jacquin C (2009) Automatic named identification of speakers using diarization and ASR systems","DOI":"10.1109\/ICASSP.2009.4960644"},{"key":"2723_CR19","doi-asserted-by":"crossref","unstructured":"Kahn J, Galibert O, Quintard L, Carr\u00e9 M, Giraudel A, Joly P (2012) A presentation of the REPERE challenge. In: the 10th international workshop on content-based multimedia indexing (CBMI), pp 1\u20136","DOI":"10.1109\/CBMI.2012.6269851"},{"key":"2723_CR20","unstructured":"Khoury E, S\u00e9nac C, Joly P (2012) Audiovisual diarization of people in video content. In: Multimedia tools and applications"},{"key":"2723_CR21","unstructured":"Le VB, Barras C, Ferr\u00e0s M (2010) On the use of GSV-SVM for speaker diarization and tracking. In: Odyssey - the speaker and language recognition workshop, pp 146\u2013150"},{"key":"2723_CR22","doi-asserted-by":"crossref","unstructured":"Mauclair J, Meignier S, Est\u00e8ve Y (2006) Speaker diarization: about whom the speaker is talking?. In: IEEE Odyssey 2006 - the speaker and language recognition workshop","DOI":"10.1109\/ODYSSEY.2006.248114"},{"key":"2723_CR23","unstructured":"Petit-Renaud S, Jousse V, Meignier S, Est\u00e8ve Y (2010) Identification of speakers by name using belief functions. In: the 13th international conference on information processing and management of uncertainty in knowledge-based systems, theory and methods, IPMU, pp 179\u2013188"},{"key":"2723_CR24","doi-asserted-by":"crossref","unstructured":"Pham PT, Moens M-F, Tuytelaars T (2010) Naming persons in news video with label propagation. 
In: IEEE international conference on Multimedia and Expo, ICME, pp 1528\u20131533","DOI":"10.1109\/ICME.2010.5583271"},{"issue":"3","key":"2723_CR25","first-page":"44","volume":"18","author":"PT Pham","year":"2011","unstructured":"Pham PT, Tuytelaars T, Moens M-F (2011) Naming people in news videos with label propagation. IEEE MultiMedia 18(3):44\u201355","journal-title":"IEEE MultiMedia"},{"key":"2723_CR26","doi-asserted-by":"crossref","unstructured":"Poignant J, Besacier L, Qu\u00e9not G, Thollard F (2012) From text detection in videos to person identification. In: IEEE international conference on multimedia and expo, ICME, pp 854\u2013859","DOI":"10.1109\/ICME.2012.119"},{"key":"2723_CR27","doi-asserted-by":"crossref","unstructured":"Poignant J, Bredin H, Le VB, Besacier L, Barras C, Qu\u00e9not G (2012) Unsupervised speaker identification using overlaid texts in TV broadcast. In: the 13th annual conference of the international speech communication association, INTERSPEECH, pp 2650\u20132653","DOI":"10.21437\/Interspeech.2012-344"},{"key":"2723_CR28","unstructured":"Poignant J, Besacier L, Qu\u00e9not G (2013) Nommage non-supervis\u00e9 des personnes dans les \u00e9missions de t\u00e9l\u00e9vision: une revue du potentiel de chaque modalit\u00e9. In: la 10\u00e8me cOnf\u00e9rence en recherche d\u2019Information et applications, CORIA"},{"key":"2723_CR29","doi-asserted-by":"crossref","unstructured":"Poignant J, Besacier L, Le VB, Rosset S, Qu\u00e9not G (2013) Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both ?. In: the 14th annual conference of the international speech communication association, INTERSPEECH","DOI":"10.21437\/Interspeech.2013-380"},{"key":"2723_CR30","doi-asserted-by":"crossref","unstructured":"Poignant J, Bredin H, Besacier L, Qu\u00e9not G, Barras C (2013) Towards a better integration of written names for unsupervised speakers identification in videos. 
In: First workshop on speech, language and audio in multimedia - the 14th annual conference of the international speech communication association, INTERSPEECH-SLAM","DOI":"10.21437\/Interspeech.2012-344"},{"key":"2723_CR31","doi-asserted-by":"crossref","unstructured":"Poignant J, Besacier L, Qu\u00e9not G (2014) Nommage non-supervis\u00e9 des personnes dans les \u00e9missions de t\u00e9l\u00e9vision: utilisation des noms \u00e9crits, des noms prononc\u00e9s ou des deux?. In: Documents num\u00e9riques, pp 37\u201360","DOI":"10.3166\/dn.17.1.37-60"},{"key":"2723_CR32","unstructured":"Rouvier M, Meignier S (2012) A Global Optimization Framework For Speaker Diarization. In: Odyssey - the speaker and language recognition workshop"},{"key":"2723_CR33","doi-asserted-by":"crossref","unstructured":"Rouvier M, Favre B, Bendris M, Charlet D, Damnati G (2014) Scene understanding for identifying persons in TV shows: beyond face authentication. In: 12th international workshop on content-based multimedia indexing (CBMI)","DOI":"10.1109\/CBMI.2014.6849829"},{"key":"2723_CR34","doi-asserted-by":"crossref","unstructured":"Sato T, Kanade T, Hughes TK, Smith MA, Satoh S (1999) Video OCR: Indexing digital news libraries by recognition of superimposed caption. In: ACM Multimedia Systems","DOI":"10.1007\/s005300050140"},{"key":"2723_CR35","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1109\/93.752960","volume":"6","author":"S Satoh","year":"1999","unstructured":"Satoh S, Nakamura Y, Kanade T (1999) Name-It: naming and detecting faces in news videos. IEEE Multimedia 6:22\u201335","journal-title":"IEEE Multimedia"},{"key":"2723_CR36","doi-asserted-by":"crossref","unstructured":"Tranter SE (2006) Who really spoke when? finding speaker turns and identities in broadcast news audio. 
In: the 31st IEEE international conference on acoustics, speech and signal processing, ICASSP, pp 1013\u20131016","DOI":"10.1109\/ICASSP.2006.1660195"},{"key":"2723_CR37","unstructured":"U\u0159i\u010d\u00e1\u0159 M, Franc V, Hlav\u00e1\u010d V (2012) Detector of Facial Landmarks Learned by the Structured Output SVM. In: the 7th international conference on computer vision theory and applications, pp 547\u2013556"},{"key":"2723_CR38","doi-asserted-by":"crossref","unstructured":"Yang J, Hauptmann A G (2004) Naming every individual in news video monologues","DOI":"10.1145\/1027527.1027666"},{"key":"2723_CR39","doi-asserted-by":"crossref","unstructured":"Yang J, Yan R, Hauptmann A G (2005) Multiple instance learning for labeling faces in broadcasting news video. In: the 13th ACM international conference on multimedia, ACMMM, pp 31\u201340","DOI":"10.1145\/1101149.1101155"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-015-2723-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11042-015-2723-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-015-2723-1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,9]],"date-time":"2024-06-09T19:32:16Z","timestamp":1717961536000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11042-015-2723-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,7,1]]},"references-count":39,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2016,8]]}},"alternative-id":["2723"],"URL":"https:\/\/doi.org\/10.1007\/s11042-015-2723-1","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,7,1]]}}}