Finalist Award (Top 2%) in the COMAP Mathematical Contest in Modeling (MCM)
Completed in collaboration with Molei Qin and Qijian Lv
About the process
In COMAP’s Mathematical Contest in Modeling (MCM), our team chose to develop a pricing model for the Hong Kong second-hand sailboat market. During the intense four-day competition, we gathered global data on second-hand sailboat prices. Weighing the strengths and weaknesses of machine learning and multiple regression, we combined the two approaches into a comprehensive evaluation model for second-hand sailboats. My primary responsibility was developing this model: I broke the pricing problem down into key factors and isolated the brand effect as a separate coefficient. Using the XGBoost algorithm, we then analyzed the actual condition of each sailboat in depth, which mitigated the influence of brand premiums on the analysis. I also managed data collection, built the brand model, coded the algorithms, and contributed to the graphical design of the paper, helping to keep the final paper consistent and professional.
Figure: The Overall Structure of the Article.
Summary
This paper studies pricing and the consistency of regional effects in the used sailboat market by constructing mathematical models and simulating the regional effect of Hong Kong (SAR) on sailboat listing prices. The main models are the Integrated Brand Premium Index (IBPI) Model, the Used Sailboat Pricing Model, and the Hong Kong (SAR) Simulated Pricing Model.
Specifically, we use the basic attributes of used sailboats and brand premium indices to construct the IBPI model, which explores the impact of brand premiums on sailboat pricing. After data preprocessing and feature engineering, we apply XGBoost regression to establish the Used Sailboat Pricing Model and explain the impact of regional factors on listing prices. We show that regional effects are not consistent and discuss the practical and statistical significance of the regional effects identified. We also construct the Hong Kong (SAR) Simulated Pricing Model and explain the regional effects in Hong Kong.
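To make the idea of a brand coefficient concrete, here is a minimal sketch of one way such an index could be computed. It is not the IBPI definition from the paper, which this summary does not give. It assumes each listing comes with a hypothetical baseline price predicted from non-brand attributes alone, and it takes the geometric mean of the listed-to-baseline ratio per brand. The function name, brand names, and figures are all made up for illustration.

```python
import math
from collections import defaultdict


def brand_premium_index(listings):
    """Geometric mean of listed/baseline price per brand (illustrative only).

    Each listing is (brand, listed_price, baseline_price), where the baseline
    is ASSUMED to come from a model that ignores the brand entirely.
    """
    log_ratios = defaultdict(list)
    for brand, listed_price, baseline_price in listings:
        if listed_price <= 0 or baseline_price <= 0:
            raise ValueError("prices must be positive")
        log_ratios[brand].append(math.log(listed_price / baseline_price))
    # Averaging in log space and exponentiating gives the geometric mean.
    return {
        brand: math.exp(sum(values) / len(values))
        for brand, values in log_ratios.items()
    }


# Hypothetical brands and prices, not data from the paper.
listings = [
    ("BrandA", 120_000.0, 100_000.0),
    ("BrandA", 240_000.0, 200_000.0),
    ("BrandB", 90_000.0, 100_000.0),
    ("BrandB", 180_000.0, 200_000.0),
]
index = brand_premium_index(listings)
```

Under this assumed definition, an index above 1 means the brand lists above what its attributes alone would suggest, and an index below 1 means it lists at a discount.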
For Q1, we construct the Used Sailboat Pricing Model through XGBoost regression on the IBPI and 21 important features, such as GDP, GDP per capita, displacement, and year. The model explains the listing prices of used sailboats well, with R-squared values exceeding 0.9.
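For readers unfamiliar with the accuracy measure quoted above, the sketch below computes R-squared (the coefficient of determination) from actual and predicted prices. It uses only the Python standard library. The prices are made-up numbers chosen for illustration, not results from the paper.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: spread of the actual values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up listing prices (USD) and made-up predictions, for illustration only.
listed = [100_000.0, 150_000.0, 200_000.0, 250_000.0]
predicted = [110_000.0, 140_000.0, 210_000.0, 240_000.0]
score = r_squared(listed, predicted)
```

A value close to 1 means the predictions account for most of the variation in listing prices. For a flexible model such as gradient-boosted trees, this is usually checked on held-out data.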
For Q2, we conduct hypothesis tests to determine whether regional effects on listing prices are consistent. The resulting p-values are below 0.05, indicating that regional effects on listing prices are statistically significant. Correlation analysis shows that regional effects vary across sailboat models, which suggests that the influence of geographic region is complex and interacts with other factors. We also explain the practical and statistical significance of regional effects.
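The summary does not say which hypothesis test was used. As one illustration of how a regional effect can be tested, the sketch below runs a permutation test on the one-way ANOVA F statistic using only the Python standard library. The region names and prices (in thousands of USD) are invented for the example, and the choice of test is an assumption, not the paper's method.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Share of random relabelings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between price and region
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Invented prices (thousand USD) for one hypothetical sailboat model.
prices_by_region = {
    "Europe": [95.0, 102.0, 98.0, 110.0, 105.0],
    "Caribbean": [120.0, 131.0, 125.0, 118.0, 127.0],
    "USA": [150.0, 142.0, 155.0, 148.0, 160.0],
}
p_value = permutation_p_value(list(prices_by_region.values()))
```

A p-value below 0.05 would lead to rejecting the hypothesis that region has no effect on listing price for this model. Repeating the test per sailboat model is one way to see whether the effect is consistent across models.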
For Q3, we construct the Hong Kong (SAR) Simulated Pricing Model to study the regional effect of Hong Kong (SAR) on sailboat listing prices. We find that the Hong Kong market differs across vessel types and that the regional effects for monohulls and catamarans are different.
For Q4, we find that the sailboat market in the United States is more developed and that more buyers there are willing to purchase high-value sailboats. Bavaria products carry low premiums and are highly cost-effective, while Discovery products are significantly overpriced. We also explain the impact of the herd effect, endowment effect, and calendar effect on the listing prices of used sailboats.
Keywords: Used sailboats, Integrated Brand Premium Index, XGBoost regression analysis, pricing model, regional effect.