Static Analysis of Executable File Structure for Detecting and Clustering Unknown Malware

Article type: Extension article

Authors

1 M.Sc. in Information Technology, Researcher, Imam Hossein Comprehensive University, Tehran, Iran

2 Researcher, Imam Hossein Comprehensive University, Tehran, Iran

Abstract

A popular approach to malware detection is to match a file's signature against a database of malware signatures. The signature database is extracted in advance and updated continuously. Comparing input data against the stored signatures creates storage problems and computational cost. Moreover, signature-matching detection fails when the malware's code changes, as it does in polymorphic malware. This paper presents an effective malware detection method that combines static analysis of the executable file structure with machine learning algorithms. The dataset used to train and evaluate the proposed method contains 36,567 malware samples and 17,295 benign files, and the method clusters the malware into 7 families. The results show that the proposed method can distinguish malware from benign files, and cluster it, with an accuracy above 99% and a false alarm rate below 0.4%. Compared with similar methods, the proposed method has very low processing overhead, and scanning an executable file takes 0.244 seconds on average.
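The accuracy and false alarm rate quoted above are standard confusion-matrix ratios. The sketch below shows how such figures are usually computed; the function name `detection_metrics` and the counts in the usage lines are illustrative assumptions, not values or code from the paper.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy and false positive rate from confusion-matrix counts.

    tp: malware flagged as malware      fn: malware passed as benign
    tn: benign passed as benign         fp: benign flagged as malware
    """
    total = tp + fp + tn + fn
    if total == 0 or (fp + tn) == 0:
        raise ValueError("need at least one sample and one benign sample")
    return {
        # Share of all files that were labelled correctly.
        "accuracy": (tp + tn) / total,
        # Share of benign files that raised a false alarm.
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = detection_metrics(tp=990, fp=2, tn=498, fn=10)
print(metrics)
```

With these made-up counts, 1,488 of 1,500 files are labelled correctly and 2 of 500 benign files are flagged, which is the kind of ratio the abstract reports.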


Article title [English]

Static Analysis of the Executable File Structure to Detect and Cluster Unknown Malware

Authors [English]

  • H. Tanha 1
  • M. Abbasi 2
1 IHU
2 IHU
Abstract [English]

One of the most popular ways to detect malware is to match a file's signature against a database of malware signatures. The signature database is extracted in advance and constantly updated. Checking input data against the stored signatures causes storage problems and increases computation costs. In addition, detection based on signature matching fails when the malware code changes, as it does in polymorphic malware. This paper presents an effective malware detection method that combines static analysis of the executable file structure with machine learning algorithms. The data set used to train and evaluate the proposed method includes 36,567 malware samples and 17,295 benign files, and the malware is clustered into 7 families. The results show that the proposed method can distinguish malware from benign files, and cluster it, with an accuracy of more than 99% and a false positive rate of less than 0.4%. The proposed method has very low processing overhead compared to similar methods, and the average scanning time of an executable file is 0.244 seconds.
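To make "static analysis of the executable file structure" concrete, the following is a minimal sketch that reads a few header fields from a Windows PE file without executing it. It assumes the executables are in PE format; the abstract does not list the features the authors actually use, so the chosen fields, the function `pe_header_features`, and the helper `build_minimal_pe` are hypothetical illustrations only.

```python
import struct

# Section characteristic flags defined by the PE format.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def pe_header_features(data: bytes) -> dict:
    """Read a few structural fields from the DOS, COFF and section headers."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3C stores the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + opt_size
    names, writable_executable = [], 0
    for i in range(n_sections):
        fields = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * i)
        names.append(fields[0].rstrip(b"\x00").decode("ascii", "replace"))
        flags = fields[9]
        # Sections that are both writable and executable are often
        # treated as suspicious (an assumption, not the paper's rule).
        if flags & IMAGE_SCN_MEM_EXECUTE and flags & IMAGE_SCN_MEM_WRITE:
            writable_executable += 1
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "timestamp": timestamp,
        "characteristics": characteristics,
        "section_names": names,
        "writable_executable_sections": writable_executable,
    }


def build_minimal_pe() -> bytes:
    """Hypothetical helper: build a tiny synthetic header-only PE image."""
    dos = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40)
    coff = struct.pack("<HHIIIHH", 0x014C, 1, 0, 0, 0, 0, 0x0102)
    text = struct.pack("<8sIIIIIIHHI", b".text", 0x1000, 0x1000,
                       0x200, 0x200, 0, 0, 0, 0, 0x60000020)
    return dos + b"PE\x00\x00" + coff + text


print(pe_header_features(build_minimal_pe()))
```

A feature dictionary like this could be turned into a numeric vector and passed to a classifier or clustering algorithm; which learner the authors use is not stated in the abstract.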

Keywords [English]

  • Malware Detection
  • Executable File Structure
  • Static Analysis
  • Clustering
  • Machine Learning
