TY - JOUR
T1 - Predicting and grouping digitized paintings by style using unsupervised feature learning
AU - Gultepe, Eren
AU - Conturo, Thomas Edward
AU - Makrehchi, Masoud
N1 - Publisher Copyright: © 2017 Elsevier Masson SAS
PY - 2018/5/1
Y1 - 2018/5/1
N2 - Objective: To create a system to aid in the analysis of art history by classifying and grouping digitized paintings based on stylistic features automatically learned without prior knowledge. Material and methods: 6,776 digitized paintings from eight different artistic styles (Art Nouveau, Baroque, Expressionism, Impressionism, Realism, Romanticism, Renaissance, and Post-Impressionism) were utilized to classify (predict) and cluster (group) paintings according to style. The method of unsupervised feature learning with K-means (UFLK), inspired by deep learning, was utilized to extract features from the paintings. These features were then used in: a support vector machine algorithm to classify the style of new test paintings based on a training set of paintings having known style labels; and a spectral clustering algorithm to group the paintings into distinct style groups (anonymously, without employing any known style labels). Classification performance was determined by accuracy and F-score. Clustering performance was determined by: the ability to recover the original stylistic groupings (using a cost analysis of all possible combinations of eight group label assignments); F-score; and a reliability analysis. The latter analysis used two novel ways to determine the distribution of the null-hypothesis: a uniform distribution projected onto the principal components of the original data; and a randomized, weighted adjacency matrix. The ability to gain insights into art was tested by a semantic analysis of the clustering results. For this purpose, we represented the featural characteristics of each painting by an N-dimensional feature vector, and plotted the distance between vector endpoints (i.e., similarity between paintings). Then, we color-coded the endpoints with the assigned lowest-cost style labels. The scatter plot was visually inspected for separation of the paintings, where the amount of separation between color clusters provides semantic information on the interrelatedness between styles. Results: The UFLK-extracted features resembled the edges/lines/colors in the paintings. For feature-based classification of paintings, the macro-averaged F-score was 0.469. Classification accuracy and F-score were similar/higher compared to other classification methods using more complex feature learning models (e.g., convolutional neural networks, a supervised algorithm). The clustering via UFLK-extracted features yielded 8 unlabeled style groupings. In six of eight clusters, the most common true painting style matched the cluster style assigned by cost analysis. The clustering had an F-score of 0.212 (no comparison painting clustering method is available at this time). For the semantic analysis, the featural characteristics of Baroque and Art Nouveau were found to be similar, indicating a relationship between these styles. Discussion/conclusion: The UFLK method can extract features from digitized paintings. We were able to extract characteristics of art without any prior information about the nature of the features or the stylistic designation of the paintings. The methods herein may provide art researchers with the latest computational techniques for the documentation, interpretation, and forensics of art. The tools could assist the preservation of culturally sensitive works of art for future generations, and provide new insights into works of art and the artists who created them.
AB - Objective: To create a system to aid in the analysis of art history by classifying and grouping digitized paintings based on stylistic features automatically learned without prior knowledge. Material and methods: 6,776 digitized paintings from eight different artistic styles (Art Nouveau, Baroque, Expressionism, Impressionism, Realism, Romanticism, Renaissance, and Post-Impressionism) were utilized to classify (predict) and cluster (group) paintings according to style. The method of unsupervised feature learning with K-means (UFLK), inspired by deep learning, was utilized to extract features from the paintings. These features were then used in: a support vector machine algorithm to classify the style of new test paintings based on a training set of paintings having known style labels; and a spectral clustering algorithm to group the paintings into distinct style groups (anonymously, without employing any known style labels). Classification performance was determined by accuracy and F-score. Clustering performance was determined by: the ability to recover the original stylistic groupings (using a cost analysis of all possible combinations of eight group label assignments); F-score; and a reliability analysis. The latter analysis used two novel ways to determine the distribution of the null-hypothesis: a uniform distribution projected onto the principal components of the original data; and a randomized, weighted adjacency matrix. The ability to gain insights into art was tested by a semantic analysis of the clustering results. For this purpose, we represented the featural characteristics of each painting by an N-dimensional feature vector, and plotted the distance between vector endpoints (i.e., similarity between paintings). Then, we color-coded the endpoints with the assigned lowest-cost style labels. The scatter plot was visually inspected for separation of the paintings, where the amount of separation between color clusters provides semantic information on the interrelatedness between styles. Results: The UFLK-extracted features resembled the edges/lines/colors in the paintings. For feature-based classification of paintings, the macro-averaged F-score was 0.469. Classification accuracy and F-score were similar/higher compared to other classification methods using more complex feature learning models (e.g., convolutional neural networks, a supervised algorithm). The clustering via UFLK-extracted features yielded 8 unlabeled style groupings. In six of eight clusters, the most common true painting style matched the cluster style assigned by cost analysis. The clustering had an F-score of 0.212 (no comparison painting clustering method is available at this time). For the semantic analysis, the featural characteristics of Baroque and Art Nouveau were found to be similar, indicating a relationship between these styles. Discussion/conclusion: The UFLK method can extract features from digitized paintings. We were able to extract characteristics of art without any prior information about the nature of the features or the stylistic designation of the paintings. The methods herein may provide art researchers with the latest computational techniques for the documentation, interpretation, and forensics of art. The tools could assist the preservation of culturally sensitive works of art for future generations, and provide new insights into works of art and the artists who created them.
KW - Art forensics
KW - Classification
KW - Clustering
KW - Painting styles
KW - Unsupervised feature learning
UR - http://www.scopus.com/inward/record.url?scp=85038885920&partnerID=8YFLogxK
U2 - 10.1016/j.culher.2017.11.008
DO - 10.1016/j.culher.2017.11.008
M3 - Article
AN - SCOPUS:85038885920
SN - 1296-2074
VL - 31
SP - 13
EP - 23
JO - Journal of Cultural Heritage
JF - Journal of Cultural Heritage
ER -