30 Dec 2022

Applications of Data Science in Cybersecurity

Format: APA

Academic level: Master’s

Paper type: Research Paper

Words: 2616

Pages: 10

Introduction 

In its most essential form, data science, as applied in contemporary society, involves studying and processing data to derive valuable information and to understand different concepts. Professionals from different industries use the resulting understanding to transform their respective fields. Since contemporary society is driven by information, the insights gleaned from data are creating novel ways of doing business. While conceptually similar to previous generations of decision-support technologies, the analytics systems improved through data science focus on recognizing patterns and making predictions, rather than on making decisions based on historical reporting. In this regard, the systems leverage two fundamental technological developments. Firstly, big data is projected to keep growing and to assist in predicting emerging trends. Secondly, the technologies employ analytics techniques that move beyond reporting data toward understanding trends that can assist in predicting occurrences or behavior. The analytic techniques optimize learning systems that can adapt to changing conditions.

The continued improvement and advancement in data science and machine learning make them more relevant to fields such as data security. In this regard, data scientists can apply their knowledge to cybersecurity to assist in protecting against attacks, including through the identification of suspicious behavior. These elements form the foundation of cybersecurity, which involves identifying threats, stopping attacks or intrusions, preventing fraud, and identifying spam and malware (Robinson, Gribbon, Horvath, & Cox, 2013). In light of these objectives, this paper assesses the application of data science to cybersecurity. Through data science, it is possible to identify the anomalies or abnormalities that an intrusion produces, which calls for the application of preventive measures that can reduce the severity of possible intrusions.
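The idea of identifying anomalies from an intrusion can be illustrated with a minimal sketch. It assumes a hypothetical series of hourly failed-login counts and uses a simple z-score against a historical baseline; the function names, the sample data, and the threshold of 3.0 are illustrative assumptions rather than a method taken from the cited sources.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Summarize normal behavior as the mean and sample standard deviation."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a new observation whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly constant baseline: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical hourly counts of failed logins during normal operation.
normal_hours = [4, 6, 5, 7, 5, 6, 4, 5, 6, 5, 7]
baseline = fit_baseline(normal_hours)

# New observations are scored one at a time as they arrive.
for count in (6, 90):
    status = "suspicious" if is_anomalous(count, baseline) else "normal"
    print(f"{count} failed logins: {status}")
```

A production system would likely use more robust statistics or a learned model, but the structure is the same: describe normal behavior from past data, then score each new observation against it.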

The Application of Data, Analytics, and Cybersecurity 

Ensuring seamless interaction between data, analytics, and cybersecurity is challenging. The difficulty primarily emanates from the different attack vectors that should be considered, as well as the huge amounts of data that one should go through to acquire the right insights for ensuring data security (Gillon, Aral, Lin, Mithas, & Zozulia, 2014). Corinium (2018) refers to a study by Verizon to relay that preventing a single kind of data security attack is inadequate, since attackers are capable of using different techniques while executing their campaigns. The study estimated that slightly more than 60% of the attacks emanate from hackers, slightly more than 50% involve the use of malware, and slightly more than 40% use social media in deploying the attack (Corinium, 2018). A small percentage of the attacks emanate from mistakes made by the employees of an organization, as some of their online actions can poke holes in the security that attackers might then exploit.

An emerging trend among cyber criminals is the use of artificial intelligence to craft personalized phishing emails, scale attacks, alter malware as well as ransomware, and identify the vulnerabilities within a given system (Brundage et al., 2018). They can do so in real time. Through artificial intelligence, cyber criminals can mount complicated attacks that require skilled cybersecurity personnel to monitor networks for a considerable number of threats. On some occasions, the threats might not follow the traditional patterns of cybersecurity threats. Using data to stay ahead of cybersecurity threats can be a workable idea (Adams & Heard, 2014). In this regard, enterprises have a significant amount of data at their disposal, which emanates from different sources that include server logs as well as their network infrastructure. The data can grow to enormous sizes. However, the threat response teams in the enterprise should run queries often and in real time as they find suspicious activity. They are also responsible for running the queries against the firm’s historical data sets and large streaming data sets to identify the extent of a suspected data breach (Holsopple, Yang, & Sudit, 2006). The detailed analysis is essential for finding a threat, confirming it, or discounting it, which is an indication that the company should have enough processing power to analyze huge amounts of data in seconds (Holsopple, Yang, & Sudit, 2006).
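Running the same query over historical and streaming data to establish the extent of a suspected breach can be sketched as follows. The event fields (`ts`, `host`, `remote_ip`), the sample records, and the function name are hypothetical; a real deployment would run an equivalent query on a log analytics platform rather than over in-memory lists.

```python
from itertools import chain


def hosts_touched_by(indicator_ip, *event_sources):
    """Return {host: first-seen timestamp} for hosts that contacted a suspicious IP.

    Each event source is any iterable of event dicts, so stored logs and a
    live stream can be scanned with the same logic.
    """
    affected = {}
    for event in chain(*event_sources):
        if event["remote_ip"] != indicator_ip:
            continue
        host = event["host"]
        first_seen = affected.get(host)
        if first_seen is None or event["ts"] < first_seen:
            affected[host] = event["ts"]
    return affected


# Hypothetical historical records (addresses come from documentation ranges).
historical_logs = [
    {"ts": 100, "host": "web-01", "remote_ip": "203.0.113.9"},
    {"ts": 120, "host": "web-02", "remote_ip": "198.51.100.4"},
    {"ts": 130, "host": "web-01", "remote_ip": "203.0.113.9"},
]


def live_stream():
    """Stand-in for a streaming source that yields events as they arrive."""
    yield {"ts": 140, "host": "db-02", "remote_ip": "203.0.113.9"}
    yield {"ts": 150, "host": "web-02", "remote_ip": "198.51.100.4"}


affected = hosts_touched_by("203.0.113.9", historical_logs, live_stream())
print(affected)
```

The result lists every host that communicated with the suspicious address and when it first did so, which is the kind of scoping information a response team needs before confirming or discounting a threat.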

How Big Data Can Assist In Avoiding Cybersecurity Threats 

Never-ending cybersecurity threats make it difficult for organizations to sustain the performance and growth of their business. In this light, cybersecurity can be regarded as one of the biggest challenges that organizations face in their quest to protect business information against hacking and malware. Big data platforms can store an enormous amount of data that analysts can use while examining, observing, and identifying irregularities that might occur in a network. Considering the sophistication of malware attacks in recent years, big data analytics can be considered one of the most suitable ways of preventing and avoiding cybercrime (Mahmood & Afzal, 2013). The security-related information from big data can be used to reduce the time needed to detect and resolve particular issues, which makes it possible for cyber analysts to predict or prevent network invasion and intrusion. According to a report from CSO Online (2018), more than 84% of organizations employ big data to assist in blocking cyber attacks. This suggests that introducing big data analytics in an organization could be a beneficial way to reduce cyber threats, consequently allowing companies to work towards the achievement of their predetermined goals and objectives.

Firms can now use big data analytic tools to detect cybersecurity threats, which include malicious insider programs, malware attacks, and compromised devices in a system (Mahmood & Afzal, 2013). Big data analytics seems promising for improving cybersecurity on these fronts. However, a point of concern is whether it is possible for a firm to remain protected during each day of doing business. To answer this question, it is essential to rely on the responses provided by the respondents of the study done by CSO Online. Most of the respondents indicated that they could not use the power of big data analytics to its full capacity (CSO Online, 2018). Some of the reasons they provided include the presence of voluminous data, which might be overwhelming. Other reasons include a lack of access to the most appropriate tools, systems, and experts, and the possibility of working with obsolete data (CSO Online, 2018). These reasons suggest that big data does not provide foolproof security, as the absence of expertise and poor mining can limit the ability to use analytics to fix the existing gap.

The other way through which big data can assist in avoiding cybersecurity threats is intelligent risk management. For an entity to improve its cybersecurity efforts, it is vital to use tools backed by insights related to intelligent risk management (Tannam, 2018). Big data experts should be in a position to interpret the output of the tools easily. The fundamental purpose of the automated tools is to avail the data to the experts for quick and easy interpretation. In this light, the approach allows data science analysts to source, categorize, and take care of possible threats faster. In addition to enabling intelligent risk management, big data makes it possible to visualize possible threats. In this case, big data analytic programs can assist data science experts in foreseeing the intensity and the class of cybersecurity threats (Assunção et al., 2015). Through this capability, experts can weigh the complexity of an attack by evaluating the sources and patterns of the data availed (Assunção et al., 2015). The programs also make it possible for them to use current and historical data to derive a statistical understanding of the accepted and rejected trends.

According to Assunção et al. (2015), intelligent big data analytics can enable data scientists to create a predictive model that provides an alert immediately after an entry point of an attack is identified. Developing a mechanism with this capability is possible through artificial intelligence or machine learning. In addition to ensuring that their systems remain secure, firms can stay ahead of attackers through penetration testing, which is availed through big data tools (Assunção et al., 2015). Data science experts can conduct infrastructure penetration testing to acquire insights into the database and the processes of a system, which is necessary for keeping hackers at bay. According to Benjamin (2010), penetration testing can be understood as a simulated attack against a system that attempts to exploit its vulnerabilities. The exercise is vital for identifying the vulnerabilities existing in the system, including the capabilities of the applied processes and the available analytics solutions. A firm can protect its IT infrastructure and business data by applying penetration testing.
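The predictive alerting model mentioned above can be sketched in a few lines. The sketch assumes a logistic scoring function whose feature names, weights, bias, and threshold are all hypothetical placeholders; in practice the weights would be learned from labeled historical data rather than set by hand.

```python
import math

# Hypothetical weights standing in for a trained model's coefficients.
WEIGHTS = {"failed_logins": 0.4, "new_device": 1.5, "off_hours": 0.8}
BIAS = -4.0


def attack_probability(features):
    """Map event features to a probability with the logistic function."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def should_alert(features, threshold=0.5):
    """Raise an alert as soon as the estimated probability reaches the threshold."""
    return attack_probability(features) >= threshold


# A burst of failed logins from a new device outside business hours.
suspicious = {"failed_logins": 8, "new_device": 1, "off_hours": 1}
# A single mistyped password from a known device.
routine = {"failed_logins": 1}

for label, event in (("suspicious", suspicious), ("routine", routine)):
    print(label, round(attack_probability(event), 3), should_alert(event))
```

Scoring each event as it arrives is what allows the alert to fire immediately; the threshold controls the trade-off between missed attacks and false alarms.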

The stages considered during penetration testing include planning and reconnaissance, scanning, gaining access, maintaining access, and conducting an analysis as well as web application firewall (WAF) configuration (Allen & Cardwell, 2016). The results obtained after carrying out the different steps can be used to fortify the system by improving the WAF security policies (Peterson, 2007). After the firm configures the policies and strengthens its processes, it will be in a position to gauge the effectiveness of the implemented preventive measures by conducting a new penetration test. However, it is worth noting that vulnerabilities in an infrastructure can go unnoticed, meaning that they cannot be managed. Such vulnerabilities exist in different places, such as end-user behavior, application flaws, and improper configurations, among other avenues.

In addition to the identified vulnerabilities, several challenges still exist when deciding to use big data to prevent cybersecurity threats, primarily because new threats come up each day. More than half of the respondents interviewed in the CSO Online (2018) survey reported that they use big data analytics as a strategic provision, while a few of them use it in a limited capacity. Even with these statistics, it is possible to argue that the agencies for which the respondents work have been compromised at least once each month, presumably because they have not been able to fully analyze the data available. Of those that indicated that their efforts were ineffective, 49% indicated that the overwhelming volume of data was the main cause, 33% indicated that they did not have the appropriate tools, and 30% indicated that a full analysis of the data was not possible because it was stale by the time it reached the cybersecurity manager (CSO Online, 2018). The statistics reveal that big data might have its flaws in relation to threat analysis, which primarily derive from poor mining. Even though metadata is available, deriving maximum benefit from it might be challenging. On the other hand, the issue might involve finding people with the most appropriate skills to mine the available data for trends.

Making Cybersecurity Decisions Using Entity Data 

It is essential for system defenders to make cybersecurity decisions based on short-term as well as long-term consequences, which can only be derived from the assessment of risks. The assessment of the identified risks depends on the reliability of data analysis, which in turn rests on trustworthy data. According to Thuraisingham et al. (2016), like the natural sciences, data science can involve conducting experiments that can be repeated and generalized. For this reason, data scientists are obliged to conduct studies that provide the comprehensive solutions needed for assessing and assuring the trustworthiness of the information collected for cybersecurity (Thuraisingham et al., 2016). Decision-makers and analysts in an organization can make decisions based on inaccurate data they might have received, thereby increasing the vulnerability of a system. For this reason, data science is vital for developing a framework that can secure a system, including sharing the derived data to develop the strategies and systems to be implemented in cybersecurity.

Through data collected using data science, cybersecurity experts can be well-positioned to model defensive strategies that make it difficult for intruders to attack a given system. Just as intruders rely on attack models, decision-makers should develop a defensive strategy based on information regarding its applicability, effectiveness, and costs (Thuraisingham et al., 2016). For this reason, data science can assist in exploring different classes of applicable defensive strategies, thereby creating profiles representing their respective properties. Given the diversity of the properties considered, the defensive strategies should be based on vital aspects of a network. This requires an understanding of the computational as well as the power needs of the strategies, which can be derived from data science. For instance, some defense strategies might not be applicable to particular nodes because they are costly or require too much power to execute.
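The notion of strategy profiles and node-level constraints can be made concrete with a minimal sketch. The `Strategy` and `Node` classes, their fields, and every number below are hypothetical; the sketch only shows how a profile of effectiveness, cost, and power needs could rule a strategy in or out for a particular node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    effectiveness: float  # assumed score between 0 and 1
    cost: float           # assumed cost units
    power_draw: float     # assumed power units


@dataclass(frozen=True)
class Node:
    name: str
    budget: float
    power_limit: float


def best_strategy(node, strategies):
    """Pick the most effective strategy the node can afford and power, or None."""
    feasible = [
        s for s in strategies
        if s.cost <= node.budget and s.power_draw <= node.power_limit
    ]
    return max(feasible, key=lambda s: s.effectiveness, default=None)


strategies = [
    Strategy("deep-packet-inspection", effectiveness=0.9, cost=50, power_draw=40),
    Strategy("signature-filter", effectiveness=0.6, cost=10, power_draw=5),
    Strategy("rate-limiting", effectiveness=0.4, cost=2, power_draw=1),
]

gateway = Node("gateway", budget=100, power_limit=60)
sensor = Node("battery-sensor", budget=5, power_limit=2)

for node in (gateway, sensor):
    choice = best_strategy(node, strategies)
    print(node.name, "->", choice.name if choice else "no feasible strategy")
```

The well-resourced gateway can run the most effective strategy, while the battery-powered sensor is limited to the cheapest one, which mirrors the point that costly or power-hungry defenses do not apply to every node.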

Data science can be useful in analyzing the optimal defense strategy that a firm should use, based on the anticipated classes of attack (Thuraisingham et al., 2016). For this reason, analysts can consider the network system to be protected as a single entity, consequently employing appropriate strategies that can limit the severity of a specific attack. The entity considered corresponds to the command center of the network system, which defines the strategy that would be most applicable for protecting the assets of the firm and reducing the severity of a possible attack. For this reason, addressing the changing dynamics of attack parameters requires data science, as the defense strategies obtained from experiments can be defined for each of the network nodes.

Government institutions and businesses continually face decisions about improving their cybersecurity. However, considering the limited resources at their disposal, given the direct costs as well as the opportunity costs, the objectives of the institutions might be impeded. For this reason, deciding how much to invest in improving cybersecurity is one of the most vital challenges that decision makers in these organizations face. The rarity of data-driven decision-making tools in contemporary society forces the institutions to make decisions that are primarily based on regulatory requirements, peer benchmarking, and claims derived from product marketing (Fisk, 2018). The decisions are made increasingly difficult by the growing number of pathways that hackers can use to fulfill their goals, as well as the lack of assurance that the available defenses can block those pathways. Given the barriers created by limited resources, Fisk (2018) suggests that the fundamental objective of cybersecurity policies should be that the marginal cost imposed on a cyber attacker is higher than the marginal cost of the investment made to improve cybersecurity.

The proposal presented above reflects the concern that a government or business entity can spend a considerable amount of money on defensive measures that cost the institution more than they cost the intruder. This cost-based provision assumes that the return on investment construct is viable, even though it might be difficult for the organization involved to estimate the financial cost that a possible breach might bring forth. The difficulty is exacerbated by considerations of protecting the brand and maintaining trust among the company’s customers (Fisk, 2018). Implementing applications such as big data analysis might be costly, which makes it difficult to assess the effectiveness of a firm’s security controls based on the return on investment construct. The challenge primarily emanates from the idea that a well-resourced or persistent intruder can succeed when the defense systems of an organization are weak, primarily because of limited funding. In light of this challenge, organizations should use data derived from an analysis of their systems to prioritize the defenses deemed necessary, including those that will work within the resources available for executing system defenses.
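One way to read the marginal-cost objective together with the call to prioritize defenses within available resources is as a simple ranking problem. The sketch below is an illustrative assumption, not Fisk's method: it keeps only defenses whose estimated added cost to the attacker exceeds their cost to the defender, ranks them by attacker cost per unit spent, and selects greedily until the budget runs out. All names and figures are hypothetical, and each defense is assumed to have a positive cost.

```python
def prioritize(defenses, budget):
    """Greedily select defenses by attacker-cost-per-unit-spent within a budget.

    A defense is considered only if it costs the attacker more than it
    costs the defender, following the marginal-cost objective.
    """
    ranked = sorted(
        defenses,
        key=lambda d: d["attacker_cost"] / d["cost"],
        reverse=True,
    )
    chosen, remaining = [], budget
    for defense in ranked:
        worthwhile = defense["attacker_cost"] > defense["cost"]
        if worthwhile and defense["cost"] <= remaining:
            chosen.append(defense["name"])
            remaining -= defense["cost"]
    return chosen


# Hypothetical estimates in arbitrary cost units.
defenses = [
    {"name": "multi-factor-auth", "cost": 10, "attacker_cost": 80},
    {"name": "network-segmentation", "cost": 40, "attacker_cost": 120},
    {"name": "log-analytics", "cost": 30, "attacker_cost": 60},
    {"name": "data-loss-prevention", "cost": 50, "attacker_cost": 40},
]

selected = prioritize(defenses, budget=60)
print(selected)
```

A greedy ranking is not guaranteed to be optimal, and the attacker-cost estimates are the hard part in practice, but the sketch shows how data about a firm's own systems could turn the cost-based objective into a concrete shortlist.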

Conclusion 

Data science can assist in identifying the anomalies or abnormalities that an intrusion produces, which calls for the application of preventive measures that can reduce the severity of possible intrusions. The need to ensure the security of an entity’s network systems emanates from the complexity and heterogeneity of cyberspace, which provides cybercriminals with the opportunity to use their skills to intrude into a company’s network and conduct malicious activities. In light of the need to use data science applications for cybersecurity, the components covered include the application of data and analytics, the extent to which big data analysts can use different computing tools to enhance cybersecurity, and the use of data science to inform cybersecurity strategies. The strategies employed by an organization depend on the applicability, effectiveness, and costs of the defense systems developed.

References 

Adams, N., & Heard, N. (2014). Data analysis for network cyber-security. World Scientific Publishing Co., Inc.

Allen, L., & Cardwell, K. (2016). Advanced penetration testing for highly-secured environments. Packt Publishing Ltd.

Assunção, M. D., Calheiros, R. N., Bianchi, S., Netto, M. A., & Buyya, R. (2015). Big Data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing, 79, 3-15.

Benjamin, P. (2010). U.S. Patent No. 7,784,099. Washington, DC: U.S. Patent and Trademark Office.

Brundage, M., Avin, S., Clark, J., Toner, H., Eckersley, P., Garfinkel, B., et al. (2018). The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228.

Corinium. (2018). Data, analytics and cyber security: How can they work together harmoniously? Retrieved 24 July 2019, from https://www.coriniumintelligence.com/insights/data-analytics-and-cyber-security-how-can-they-work-together-harmoniously/

CSO Online. (2018). How big data is improving cyber security. Retrieved from https://www.csoonline.com/article/3139923/how-big-data-is-improving-cyber-security.html

Fisk, M. (2018). Data-driven decision making for cyber security. In N. Heard, N. Adams, P. Rubin-Delanchy, & M. Turcotte (Eds.), Data science for cyber-security (Vol. 3). London; Hackensack, NJ: World Scientific Publishing Europe Ltd.

Gillon, K., Aral, S., Lin, C., Mithas, S., & Zozulia, M. (2014). Business analytics: Radical shift or incremental change? Communications of the Association for Information Systems, 34. doi: 10.17705/1cais.03413

Holsopple, J., Yang, S. J., & Sudit, M. (2006). TANDI: Threat assessment of network data and information. In Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications 2006 (Vol. 6242, p. 62420O). International Society for Optics and Photonics.

Mahmood, T., & Afzal, U. (2013). Security analytics: Big data analytics for cybersecurity: A review of trends, techniques and tools. 2013 2nd National Conference on Information Assurance (NCIA). doi: 10.1109/ncia.2013.6725337

Peterson, G. (2007). Security architecture blueprint. Arctec Group, LLC.

Robinson, N., Gribbon, L., Horvath, V., & Cox, K. (2013). Cyber-security threat characterisation. Retrieved from https://www.rand.org/pubs/research_reports/RR235.html

Tannam, E. (2018). Data science is changing how cybersecurity teams hunt threats. Retrieved 24 July 2019, from https://www.siliconrepublic.com/enterprise/data-science-cybersecurity

Thuraisingham, B., Kantarcioglu, M., Hamlen, K., Khan, L., Finin, T., Joshi, A., Oates, T., & Bertino, E. (2016). A data driven approach for the science of cyber security: Challenges and directions. In 2016 IEEE 17th International Conference on Information Reuse and Integration (IRI) (pp. 1-10). IEEE.

StudyBounty. (2023, September 14). Applications of Data Science in Cybersecurity.
https://studybounty.com/applications-of-data-science-in-cybersecurity-research-paper
