
Asian Review of Financial Research

Asian Review of Financial Research Vol.38 No.4 pp.51-81 https://www.doi.org/10.37197/ARFR.2025.38.4.2
Financial Information-Based Corporate Credit Rating Model Using Graph Neural Network
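Myung Jun Kim, Integrated M.S./Ph.D. Program, Department of Industrial and Management Engineering, Pohang University of Science and Technology
Taegyum Kim, Master's Program, Department of Industrial and Management Engineering, Pohang University of Science and Technology
Youngju Ahn, Integrated M.S./Ph.D. Program, Department of Industrial and Management Engineering, Pohang University of Science and Technology
Bong-Gyu Jang*, Professor, Department of Industrial and Management Engineering, Pohang University of Science and Technology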
Key Words: Credit Rating Models, Financial Ratios, Machine Learning Model, Graph Neural Network, Industrial Structure

Abstract

Credit rating is an essential component of the financial sector, as it evaluates a firm's capacity to meet debt obligations. In the Republic of Korea, rating agencies assign grades ranging from AAA to D, with additional modifiers, and these ratings significantly affect financing conditions. Traditional methods typically rely on logistic regression and selected financial variables, yet these approaches often face difficulties in capturing the intricate or nonlinear patterns present in corporate financial data. In response to these challenges, researchers have increasingly turned to advanced machine learning algorithms that can account for more complex relationships. Nevertheless, their deployment is limited by relatively small datasets, particularly among smaller firms, and by concerns regarding model interpretability. The present study proposes a machine learning framework, including a Graph Neural Network (GNN), to predict rating changes in Korean firms. The dataset spans 2010 to mid-2024 and includes 182 firms, yielding 1,417 firm-year samples. Each annual observation is labeled according to whether the firm's credit rating was upgraded (+1), downgraded (-1), or left unchanged (0). Because most observations lie in the unchanged category, the data are highly imbalanced. To address this imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is employed, generating additional samples in the minority classes. In addition, Principal Component Analysis (PCA) is utilized to reduce the dimensionality of sixteen indicators representing profitability, growth, stability, and activity. Six models are assessed: logistic regression, LASSO, random forest, support vector machine, Light Gradient Boosting Machine (LGBM), and a GNN. The GNN-based approach is noteworthy for modeling financial indicators or corporate attributes as nodes within a graph, with edges representing the relationships among these nodes.
Such a representation allows for the capture of latent dependencies that are difficult to detect in methods that treat predictors independently. Furthermore, each corporate sample belongs to discrete subgroups determined by listing market (KOSPI or KOSDAQ), data period, rating band, and industry classification. The proposed hierarchical GNN merges these subgroup-specific networks to produce a consolidated prediction, thereby incorporating both firm-level attributes and group-level characteristics. When compared to a benchmark that classifies samples randomly in proportion to the observed class distribution, all six machine learning algorithms demonstrate superior performance. The GNN shows the highest precision and F1-scores, suggesting that it is particularly effective at identifying upgrades and downgrades, which are far less common than unchanged ratings. Nonetheless, like the other models, it finds rare rating shifts harder to predict, highlighting the impact of data imbalance and the difficulty of forecasting uncommon events. An inspection of feature importance across models underscores the significance of growth and activity metrics, implying that sales expansion, equity growth, and the efficient use of assets offer robust signals of rating volatility. Moreover, the GNN indicates that distinguishing firms by industry group is especially influential, possibly because each sector's distinctive regulatory, economic, and financial traits shape its credit risk profile. Compared to certain deep neural networks that demand extensive datasets, the GNN-based method presented here is more practical in settings with limited data, including smaller firms with incomplete rating histories. Additionally, this approach provides improved transparency, as the graph architecture clarifies how different financial indicators or subgroups collectively affect rating transitions.
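As an illustration of the labeling scheme and the random benchmark described above, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the simplified rating scale (modifiers omitted), the function names, and the toy label counts are all assumptions made for the example.

```python
import random
from collections import Counter

# Assumed, simplified scale for illustration; the modifiers mentioned in the
# abstract are omitted here.
RATING_SCALE = ["AAA", "AA", "A", "BBB", "BB", "B", "CCC", "CC", "C", "D"]
RANK = {grade: i for i, grade in enumerate(RATING_SCALE)}  # 0 = best grade


def label_change(previous, current):
    """Return +1 for an upgrade, -1 for a downgrade, 0 if unchanged."""
    diff = RANK[previous] - RANK[current]  # positive when the grade improved
    return (diff > 0) - (diff < 0)


def class_shares(labels):
    """Observed share of each class in the label vector."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in sorted(counts)}


def proportional_random_predictions(labels, seed=0):
    """Draw one random prediction per sample, with class probabilities
    equal to the observed class shares (the benchmark idea in the text)."""
    rng = random.Random(seed)
    shares = class_shares(labels)
    classes = list(shares)
    weights = [shares[c] for c in classes]
    return rng.choices(classes, weights=weights, k=len(labels))


def expected_baseline_accuracy(labels):
    """Expected accuracy of the proportional random classifier: sum of p_c^2."""
    return sum(p * p for p in class_shares(labels).values())


# Toy label vector with made-up counts (NOT the paper's data): mostly unchanged.
toy_labels = [0] * 80 + [1] * 8 + [-1] * 12
print(class_shares(toy_labels))
print(round(expected_baseline_accuracy(toy_labels), 4))
```

For the toy vector, the expected accuracy of this benchmark is 0.8² + 0.08² + 0.12² = 0.6608. Its expected precision and recall for each class equal that class's share, so the benchmark scores poorly on the rare upgrade and downgrade classes, which is where a trained model has the most room to improve.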
Future work may benefit from enlarging the dataset, experimenting with alternative oversampling strategies such as ADASYN, and examining cost-sensitive learning to mitigate the imbalance problem further. Future studies might also consider alternative graph structures that treat entire firms as nodes and model inter-firm relationships as edges, or incorporate advanced architectures such as Transformers or LSTM networks. In summary, the findings suggest that a GNN-based framework can improve credit rating predictions by capturing complex interactions that traditional or other advanced machine learning methods may overlook. Data imbalance remains a problem, although SMOTE partly mitigates its effects. The importance of growth, activity, and sector-specific characteristics suggests that integrating richer and more interconnected data can yield more accurate and interpretable rating predictions. Further research and more extensive data collection should improve the precision and reliability of credit rating systems and deepen the understanding of corporate finance dynamics.
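The interpolation step behind SMOTE, which the study uses to offset class imbalance, can be sketched as follows. This is a minimal pure-Python illustration of the general technique, not the authors' implementation: the function name `smote_oversample`, the neighbour count `k`, the seed, and the toy feature vectors are assumptions for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point,
        # excluding the base point itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class in a two-dimensional feature space (made-up values).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_oversample(minority_samples, n_new=10, k=2)
print(len(new_samples))
```

Because every synthetic point is interpolated between two existing minority samples, oversampling of this kind adds no information beyond the observed minority region. That is consistent with the abstract's remark that SMOTE only partly mitigates the imbalance problem.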