William Qian's Portfolio

Publications

2024 In Progress

Stavroula A. Chrysanthopoulou, Zehao Qian and Wanyi Chen, "Emulators for cohort-based, Markov state, simulation models: an application to the RESPOND model for Opioid Use Disorder (OUD)"

Purpose: In this study we present an innovative idea and unique example of building emulators for cohort-based Markov simulation models.

Methods: We have developed the Researching Effective Strategies to Prevent Opioid Death (RESPOND) simulation model to characterize Opioid Use Disorder (OUD) dynamics, make projections, evaluate interventions, and inform decision making in this area. Because of its complex structure, running RESPOND can be computationally intensive. The objective of this study is to build an emulator (metamodel), namely a simulation model with a simpler structure, that efficiently maps model inputs to outputs, based on a version of the RESPOND model calibrated to observed OUD trends in Massachusetts. We explore three statistical approaches for an emulator of the RESPOND model: 1) a generalized linear model (GLM) for longitudinal data, 2) a Mixed-Effects Random Forest (MERF), and 3) a Long Short-Term Memory (LSTM) recurrent neural network model. We describe model selection procedures for determining, separately for each model category, the best-fitting model (set of parameter values) for the available simulated RESPOND data. We compare the three approaches in terms of their accuracy (weighted Root Mean Squared Error (wRMSE) and Mean Absolute Error (MAE)) and efficiency (running time) using simulated data (overall and fatal overdose counts) from the calibrated RESPOND model.
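The two accuracy metrics named above can be illustrated with a minimal sketch. The abstract does not say how the wRMSE weights are chosen, so the weighting scheme here (one non-negative weight per observation, normalized by the weight total) is an assumption, as are the function names and the toy numbers.

```python
import math


def weighted_rmse(y_true, y_pred, weights):
    """Weighted RMSE: sqrt(sum(w * (y - yhat)^2) / sum(w)).

    The per-observation weights are an assumption; the abstract does
    not specify how the weights are defined.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sq = sum(
        w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights)
    )
    return math.sqrt(weighted_sq / total_weight)


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values only: hypothetical simulator outputs vs. emulator predictions.
simulated = [100.0, 120.0, 90.0]
emulated = [110.0, 115.0, 90.0]
weights = [1.0, 2.0, 1.0]

wrmse = weighted_rmse(simulated, emulated, weights)
mae = mean_absolute_error(simulated, emulated)
```

With these toy inputs the weighted squared error is 150 over a weight total of 4, so the wRMSE is the square root of 37.5, and the MAE is 5.0.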

Results/Conclusions: Our findings show that the LSTM model provides more accurate predictions, while GLMs can provide insightful details about observed trends and associations of key factors over time. We also discuss how emulators, owing to the efficiency of the underlying algorithms, can be used to create online, user-friendly applications that depict the evaluation of health care alternatives in real time.

Keywords: GEE, GLME, MERF, LSTM, Model Tuning

2022 Published

J. Chen, Z. Hu and Z. Qian, "Research on Malicious URL Detection Based on Random Forest," 2022 14th International Conference on Computer Research and Development (ICCRD), Shenzhen, China, 2022, pp. 30-36, doi: 10.1109/ICCRD54409.2022.9730451.

Malicious URLs have become a serious threat to cybersecurity and serve as incubators for Internet criminal activity. By visiting malicious URLs, users may fall victim to attacks such as spamming, phishing, and drive-by downloads, which seriously threaten their privacy and security and cause losses of billions of dollars every year. Traditional methods such as URL blacklists can classify most known URLs but perform poorly on newly generated ones. To forestall greater economic losses, it is imperative to adopt a method that can classify URLs in a timely manner. To improve the timeliness of malicious URL detection, we use machine learning algorithms to classify URLs automatically. In this article, we take the experimental results of several common machine learning models on our data set as the baseline and compare them side by side with the results of a random forest classifier. We then optimize the random forest classifier to achieve the best results at the lowest complexity.
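As a rough illustration of how a URL can be turned into numeric inputs for classifiers like those compared above, the sketch below extracts a few lexical features using only the Python standard library. The abstract does not list the features used in the paper, so the feature set, the function name `lexical_features`, and the example URL are all hypothetical.

```python
from urllib.parse import urlparse


def lexical_features(url):
    """Return a dict of simple lexical features for one URL.

    The feature choice is an assumption for illustration; it is not
    taken from the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments, e.g. "/a/b" has two.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


# Hypothetical example URL.
example_url = "http://example.com/a/b?id=42"
features = lexical_features(example_url)
```

A feature dictionary like this could be converted to a numeric vector and passed to any of the models listed in the keywords. Unlike a blacklist lookup, it can be computed for URLs that have never been seen before.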

Keywords: Malicious URL Detection, Machine Learning, SVM, KNN, XGBoost, Decision Tree, Random Forest