The Role of ImageNet Classes in Fréchet Inception Distance

dc.contributor: Aalto-yliopisto [fi]
dc.contributor: Aalto University [en]
dc.contributor.author: Kynkäänniemi, Tuomas [en_US]
dc.contributor.author: Karras, Tero [en_US]
dc.contributor.author: Aittala, Miika [en_US]
dc.contributor.author: Aila, Timo [en_US]
dc.contributor.author: Lehtinen, Jaakko [en_US]
dc.contributor.department: Department of Computer Science [en]
dc.contributor.groupauthor: Professorship Lehtinen Jaakko [en]
dc.contributor.groupauthor: Computer Science Professors [en]
dc.contributor.groupauthor: Computer Science - Visual Computing (VisualComputing) - Research area [en]
dc.contributor.groupauthor: Computer Science - Artificial Intelligence and Machine Learning (AIML) - Research area [en]
dc.contributor.groupauthor: Helsinki Institute for Information Technology (HIIT) [en]
dc.contributor.organization: Nvidia [en_US]
dc.date.accessioned: 2023-12-11T09:29:06Z
dc.date.available: 2023-12-11T09:29:06Z
dc.date.issued: 2023-05-01 [en_US]
dc.description: | openaire: EC/H2020/866435/EU//PIPE
dc.description.abstract: Fréchet Inception Distance (FID) is the primary metric for ranking models in data-driven generative modeling. While remarkably successful, the metric is known to sometimes disagree with human judgement. We investigate a root cause of these discrepancies, and visualize what FID "looks at" in generated images. We show that the feature space that FID is (typically) computed in is so close to the ImageNet classifications that aligning the histograms of Top-N classifications between sets of generated and real images can reduce FID substantially -- without actually improving the quality of results. Thus, we conclude that FID is prone to intentional or accidental distortions. As a practical example of an accidental distortion, we discuss a case where an ImageNet pre-trained FastGAN achieves a FID comparable to StyleGAN2, while being worse in terms of human evaluation. [en]
dc.description.version: Peer reviewed [en]
dc.format.extent: 26
dc.identifier.citation: Kynkäänniemi, T, Karras, T, Aittala, M, Aila, T & Lehtinen, J 2023, The Role of ImageNet Classes in Fréchet Inception Distance. in 11th International Conference on Learning Representations (ICLR 2023). Curran Associates Inc., International Conference on Learning Representations, Kigali, Rwanda, 01/05/2023. < https://arxiv.org/abs/2203.06026 > [en]
dc.identifier.isbn: 9781713899259
dc.identifier.other: PURE UUID: 06da1068-4c5e-46aa-878d-4eff54b8c2d4 [en_US]
dc.identifier.other: PURE ITEMURL: https://research.aalto.fi/en/publications/06da1068-4c5e-46aa-878d-4eff54b8c2d4 [en_US]
dc.identifier.other: PURE LINK: https://www.proceedings.com/75096.html [en_US]
dc.identifier.other: PURE LINK: https://arxiv.org/abs/2203.06026 [en_US]
dc.identifier.other: PURE LINK: https://openreview.net/forum?id=4oXTQ6m_ws8 [en_US]
dc.identifier.uri: https://aaltodoc.aalto.fi/handle/123456789/124767
dc.identifier.urn: URN:NBN:fi:aalto-202312117135
dc.language.iso: en [en]
dc.relation: info:eu-repo/grantAgreement/EC/H2020/866435/EU//PIPE [en_US]
dc.relation.ispartof: International Conference on Learning Representations [en]
dc.relation.ispartofseries: 11th International Conference on Learning Representations (ICLR 2023) [en]
dc.rights: openAccess [en]
dc.title: The Role of ImageNet Classes in Fréchet Inception Distance [en]
dc.type: A4 Artikkeli konferenssijulkaisussa (A4 Article in conference proceedings) [fi]
